No dishonour in depression


The stigma associated with mental illness discourages investment in finding cures, even though
the burden of the disorders on society is immense.


Writer Ruby Wax, a regular on British television, has clinical
depression. In her book published last week, Sane New World
(Hodder & Stoughton, 2013), she describes her struggles
with different therapies and her fear of being ‘found out’. She is not alone.
A 2010 survey in Europe revealed that 38% of people had a diagnosed
mental disorder, including 7% with major depression. The proportion
is likely to be similar in all populations, even in Africa, where psychiatric
disease barely features on the health agenda.

The stigma attached to such disorders means that many people do
not admit to their illness. The same stigma discourages investment, so
that research funding is not proportional to the distress these disorders
cause. Why lobby for better treatments for depression or schizophrenia
when there are ‘real’ diseases out there, such as cancer?

Wax has been through the catalogue of available therapies and says
that she has settled on an approach known as ‘mindfulness’, which
helps to keep her depression under control. It may seem that the various
therapies are inadequate, given that initial treatment of depression
fails in 60% or more of cases. It is true that more treatment options
are badly needed. Yet evidence-based cognitive behavioural therapies
and drugs already developed by the pharmaceutical industry can work
splendidly for long periods, provided that they are given to the right patients.

How do you recognize the right patients? Treatment decisions tend
to be based on the preferences of physicians or their patients, often with
a missionary zeal that gives no credence to the idea that a personalized
approach would be more appropriate. Some hold that drugs have unacceptable
side effects, whereas others say that cognitive therapy wastes
time if the depressed brain is not first chemically lifted. It is becoming
increasingly common to offer patients both treatments at once in the
belief that drugs can prepare the brain to respond to cognitive therapy.
That may be so, but it is also possible that the improved response rates
are simply the result of catching two different populations.

The situation would improve drastically if simple tests could be
developed to predict treatment outcome. Many exploratory clinical
trials are now under way to search for biomarkers in genes or in the
brain itself that might be predictive. This week sees the description of
the first potential biomarker for discriminating between responders
and non-responders to drugs or cognitive therapy in major depressive
disorder (C. L. McGrath et al. JAMA Psychiatry
http://dx.doi.org/10.1001/jamapsychiatry.2013.143; 2013).

The study, led by neurologist Helen Mayberg of Emory University
in Atlanta, Georgia, used positron emission tomography (PET) scans
to measure metabolic activity in various brain regions of people with
untreated depression (see also Nature http://doi.org/mtc; 2013). Patients
were randomized into groups and treated for 12 weeks with either a
commonly used antidepressant drug or cognitive behavioural therapy.
The study’s results were clear-cut. Below-average activity in a brain area
called the right anterior insula, which is linked with depression-relevant
behaviours such as emotional self-awareness and decision-making,
was associated with the patient showing a good response to cognitive
behavioural therapy and a poor response to the drug. Above-average
insula activity was predictive of the opposite.

This potential biomarker must still be tested in prospective clinical
trials, which will assign patients to a treatment on the basis of their
insula activity. It may fail. But if the biomarker comes up trumps, it
could be transformative for many patients who would not have to
endure two or three months of treatment trial and error.

If attitudes to mental illness do not change, even a successful
biomarker of this type will have a hard time being accepted by health
systems that foot the bills. Unlike a simple blood test, a PET scan is
inconvenient because not all physicians have easy access to the technology
and, at up to US$2,000 a shot, the procedure is not cheap. Although
expensive treatments for other diseases and arguments about how to
fund them are nothing new, this rational debate is harder for mental
illnesses because of the irrational stigma that is attached to them.
Fifty years ago, the stigma surrounding cancer meant that physicians
would sometimes lie to patients about the diagnosis out of kindness.
That has now faded because cancer is not always the death sentence it
once was, thanks in part to the development of biomarkers that guide
therapy. The stigma attached to mental disorders will also fade when
treatment becomes more effective. But to break out of a vicious circle
of underinvestment in a stigmatized disease area will require continued
effort to get the problem recognized. This is a good week for that.


Track the trackers 


Oversight and public debate about access to 
personal data are crucial to preserving privacy. 


Access to phone records is all over the news at the moment,
following media revelations of massive snooping by US intelligence
agencies. Collecting phone records is in itself nothing
new, and is legitimate: scientists have long done so for research.
What is unprecedented is the government weaving together multiple
huge data sets for secret state surveillance.

When researchers obtain electronic records of millions of people's
calls for social science (including data on who called whom, when
and from where, but not actual conversations), they must first make a
strong case for why they need that information, and must comply with
multiple layers of oversight and safeguards such as anonymization of
numbers. The same goes for research use of health records or any other
private personal data. And there is a good reason: to protect privacy.
Privacy concerns are at the heart of the uproar over how the US 
National Security Agency (NSA) has secretly required telephone 
companies to hand over similar phone records on almost every US 
resident. The US government is also vacuuming up billions of e-mails 
and other Internet communications from traffic outside the United 
States — all in the name of law enforcement and the war on terror. 
What is perhaps most concerning, apart from the mind-boggling 
scale of the snooping, is that until last week, the very existence of these 
programmes was secret. Since the revelations, US President Barack 
Obama has defended this secrecy, on the grounds that if terrorists 
knew that the government was monitoring phones and the Internet, 
they would seek ways around the surveillance. But most terrorists 
probably take that as a given and — unlike most ordinary citizens — 
already use encryption and other techniques to secure and obfuscate 
communication. It is a poor excuse for a lack of transparency and
public oversight of such snooping. Obama asked Americans to trust 
the government, but history shows that ‘trust us’ is not good enough. 
The revelations seem to vindicate many of the conclusions and rec- 
ommendations of a 2008 report by the US National Research Council 
(NRC) — Protecting Individual Privacy in the Struggle Against Terror- 
ists: A Framework for Program Assessment (go.nature.com/bsooux). 
That report addressed privacy issues raised by the Total Information 
Awareness programme, a research effort launched by the US Defense 
Advanced Research Projects Agency in 2002 to develop data mining 
and other technologies to link and search disparate databases, for exam- 
ple to try to identify suspicious patterns to detect and track terrorists. 
After much controversy, that programme had its funding removed 
by Congress in 2003. But as the NRC report noted, this was probably 
a pyrrhic victory for civil liberties. It removed a focused programme 
subject to congressional oversight and public debate that would deter- 
mine appropriate uses and safeguards. Instead, much the same work 
has continued in agencies across government, including the NSA, with 
less oversight. The report warned that this was “likely to result in little 
security and, ultimately, brittle privacy protection”. How right it was.


Privacy matters. Yet last week, many defenders of snooping on pri- 
vate individuals sought to play down its significance. Several, includ- 
ing UK foreign secretary William Hague, trotted out tired fallacies, 
including that people who have nothing to hide have nothing to fear. 
That has long been debunked by academics; the idea is based on a 
misconception of what privacy is about. 

Privacy is a human right, and is essential if people are to develop
autonomy. It is central to freedom of expression and association, and
to preventing abuse of personal information. There are numerous
examples of misuse of private data by agencies and law enforcement,
including intimidation, selective character assassination, repression
of dissent and wrongful arrest. Privacy is a cornerstone of a free and
creative society, and is an essential defence against unwarranted
social control.

Government officials in the United States and elsewhere should find 
the NRC report and read it carefully. It calls for “robust, independent 
oversight” of government data mining and surveillance to “mine the 
miners and track the trackers”. Some data could help security efforts, 
the report says, but it notes that many security experts have misgivings. 
They question the feasibility and reliability of data mining to look for 
and track terrorists in massive data sets, and they raise concerns about 
the risk of law-abiding individuals and companies being falsely targeted. 

Such surveillance is not unique to the United States. In April, a 
report by the United Nations’ Human Rights Council warned that 
many countries worldwide, including democracies, are increasingly 
allowing intelligence and law-enforcement agencies to deploy indis- 
criminate and extensive surveillance of communications. That weak- 
ens or removes safeguards such as justification of individual cases of 
surveillance, and oversight by a neutral judicial body. 

As the World View on page 139 shows, privacy and what it means 
in the digital age is an increasingly crucial question in the era of big 
data. A grown-up and open debate is needed, with trust on all sides. 
It has not started well.


Young upstarts 


Lucrative prizes emulating the Nobels bring 
welcome money and publicity for science. 


When a theoretical physicist who has worked on quantum field
and string theory calls attention to an “interesting
experiment”, the experiment deserves notice. This is particularly
true when that experiment is an attempt to deliver a little
Hollywood glamour to physics, with an Oscars-style ceremony and
gigantic cash prizes.

The US$3-million Fundamental Physics Prize is indeed an interest- 
ing experiment, as Alexander Polyakov said when he accepted this 
year’s award in March. And it is far from the only one of its type. As a
News Feature on page 152 discusses, a string of lucrative awards for 
researchers have joined the Nobel Prizes in recent years. Many, like the 
Fundamental Physics Prize, are funded from the telephone-number- 
sized bank accounts of Internet entrepreneurs. These benefactors have 
succeeded in their chosen fields, they say, and they want to use their 
wealth to draw attention to those who have succeeded in science. 

What’s not to like? Quite a lot, according to a handful of scientists 
quoted in the News Feature. You cannot buy class, as the old saying 
goes, and these upstart entrepreneurs cannot buy their prizes the 
prestige of the Nobels. The new awards are an exercise in self-pro- 
motion for those behind them, say scientists. They could distort the 
meritocracy of peer-review-led research. They could cement the status
quo of peer-reviewed research. They do not fund peer-reviewed
research. They perpetuate the myth of the lone genius.

The goals of the prize-givers seem as scattered as the criticism. Some 
want to shock, others to draw people into science, or to better reward 
those who have made their careers in research. Several want to show that 
leading scientists can attain the lifestyles of financiers and footballers. 

As Nature has pointed out before, there are some legitimate concerns 
about how science prizes — both new and old — are distributed. The 
Breakthrough Prize in Life Sciences, launched this year, takes an unrep- 
resentative view of what the life sciences include (see Nature 494, 402; 
2013). But the Nobel Foundation’s limit of three recipients per prize, 
each of whom must still be living, has long been outgrown by the col- 
laborative nature of modern research — as will be demonstrated by the 
inevitable row over who is ignored when it comes to acknowledging the 
discovery of the Higgs boson. The Nobels were, of course, themselves set 
up by a very rich individual who had decided what he wanted to do with
his own money. Time, rather than intention, has given them legitimacy. 

As much as some scientists may grumble about the new awards, 
the financial doping that they bring to research and the wisdom of 
the goals behind them, two things seem clear. First, most researchers 
would accept such a prize if they were offered one. Second, it is surely 
a good thing that the money and attention come to science rather than 
go elsewhere. It is fair to criticize and question the mechanism (that
is the culture of research, after all), but it is the prize-givers’ money
to do with as they please. It is wise to accept such gifts with gratitude
and grace.
WORLD VIEW


Be prepared for the big genome leak


It is only a matter of time until idealism sees the release of confidential
genetic data on study participants, says Steven E. Brenner.


Most people in the United States could soon know someone
whose genome is held in a research database. Concerns are
growing about our ability to properly control access to that
information. Also growing among some scientists is the feeling that
restricting access to genomic data fetters research. How long will it be
until an idealistic and technically literate researcher deliberately releases
genome and trait information publicly in the name of open science?

Both the open-access literature and the open-source software movements
began with idealists. It seems inevitable that there will be a
major leak of genome information in the near future. Individual scientists,
institutions and funders should consider now how they will
react when this happens.

Some studies already gather the genetic data of more than 50,000
individuals in a single analysis. Although this information is supposed
to be highly protected, it is disseminated to various institutions that
have inconsistent security and privacy standards. In practice, data
protection often comes down to individual scientists. Once leaked,
these data would be virtually impossible to contain.

What harm would come from a leak of personal and genomic data?
The consent form for the Personal Genome Project (PGP), which
makes no attempt to keep genetic information secret, offers a guide.
It lists a range of adverse consequences, from revealing non-paternity
to being framed with synthesized DNA planted at a crime scene.

Most research genome data are de-identified,
but given progress in re-identification and commercial genetic databases,
will they stay that way? De-anonymized genomic data would be
most likely to reveal health conditions relevant to the study for which
they were collected. The effects might be uncomfortable but would
probably reveal less than a typical Google search history. So far, no
PGP participant who released genomes and traits has experienced
adverse consequences that have been reported to the Institutional
Review Board. In the longer term, the risk of harm may rise as our
understanding of genetic variation increases.

Then there is the public outcry a genome breach might incite. The
public often has an exaggerated perception of the links between genes
and personal traits. Lacking contextual information, research participants
could wonder whether their own genomes had been leaked and
dread implausibly dire consequences.

Thus a genome leak might lead to a backlash. Volunteers might
withdraw from research studies and refuse to join new ones. Research
might even be subject to moratoriums and prohibitive restrictions.
The harm to genetic research could be great, and study participants
could be unsettled.


What can be done? Two extreme options offer appealing simplicity. 
One is for research projects to incorporate unrestricted data release 
from the outset. This option should be offered more broadly owing to 
the certainty and research benefits it offers. However, would enough 
people be willing to share so openly? The second option would be to 
lock down genomes so tightly that they are virtually impossible to steal, 
for example by only allowing analyses on central computers through 
restricted interfaces. Although useful as an alternative, this system
would stymie research were it to become the exclusive means of access
to data, and it would still remain vulnerable to ingenious ways of
eliciting inappropriate genetic information.

Neither option is comprehensively workable, which means that the 
question is not how to prevent a leak but how to mitigate the fall- 
out. This requires some specific steps, as well as progress in adapting 
concepts already used elsewhere in biological 
research and in applying principles proposed by 
groups such as the Presidential Commission for 
the Study of Bioethical Issues in Washington DC. 

Funders should develop rapid mechanisms for 
notifying study participants, governments and the 
media when breaches occur and provide informed 
guidance about scope and probable consequences 
for those affected. This would require recontact- 
ing research participants to warn those whose data 
were leaked and, implicitly, to calm others whose 
data remain secure. More research is needed about 
the possible harm of such leaks to better inform 
and protect research participants before and after 
leaks occur. 

We should also take steps to minimize the frequency and extent of 
future genome leaks. Institutions could establish uniform protocols and 
reviews to ensure the safety of protected genomic data. All researchers 
using restricted genomic data should be trained regarding the ethics of 
and the technologies involved in protecting human data. Technical and 
legal strategies should be proactively deployed to help limit dissemina- 
tion of leaked data to those who furtively hunt for them. 

Augmented legal protections could reduce the harm from inappro- 
priate use of such data. In the meantime, we need to address a quan- 
dary: research with leaked data would undoubtedly speed immediate 
scientific progress, but should scientists exploit them? 

Most importantly, we must ensure that the necessary discussion 
about the risks of a genome leak is balanced with information about 
the tremendous benefits that collected genetic information has for 
all of us. Although the acceleration and promise of genomics makes
a leak inevitable, it also guarantees medical progress. SEE EDITORIAL P.137


Steven E. Brenner is a Professor at the University of California, 
Berkeley. 
e-mail: brenner@compbio.berkeley.edu 
RESEARCH HIGHLIGHTS
Selections from the scientific literature


Birds’ mysterious missing penises


The development of chicken penises is cut short by signals
that prompt cell death, a finding that could help to
explain why 97% of bird species have little or no
phallus despite reproducing by internal fertilization.

Researchers led by Martin Cohn at the University of
Florida in Gainesville cut tiny windows into eggs to compare
developing chickens, which lack phalluses, with ducks,
whose penises can be half as long as their bodies.

Chicken embryos began to form penises, but these
shrank midway through development. The researchers
pinned the cause on elevated levels of a protein called Bmp4,
which promotes cell death, at the tip of the organ.

The loss of penises may have been a by-product of the
evolution of other features, such as beak shape, which
are also influenced by Bmp proteins, the authors suggest.
Curr. Biol. http://dx.doi.org/10.1016/j.cub.2013.04.062 (2013)

For a longer story on this research, see go.nature.com/1mgn9w


Sea stars shed too-hot arms


Sea stars may use their arms to keep their central cores
cool when high temperatures threaten their survival.

Sylvain Pincebourde at the University of Tours, France,
and his colleagues kept ochre sea stars (Pisaster ochraceus,
pictured) under conditions that mimicked the sweltering
temperatures to which the organisms can be exposed at
low tides. Most sea stars died if their core temperatures
exceeded 35 °C. Individuals that survived the heating
generally had arms that were hotter than their cores,
possibly because the creatures reroute body fluids into their
central cavities to cool down. When their cores warmed, sea
stars shed arms, consistently losing the hottest one first.

J. Exp. Biol. 216, 2183-2191 (2013)


CLIMATE SCIENCE


Reindeer keep the ground cool


Reindeer herding practices and their effect
on vegetation in northern Scandinavia may
influence when snow melts in spring.

Tall, dense shrubs can hasten snow melt in
the tundra. As more branches protrude over
packed snow, less sunlight is reflected off the
bright surface and more heat is absorbed by the
ground.

A team led by Juval Cohen at the Finnish
Meteorological Institute in Helsinki used satellite
observations to examine the cover of vegetation
and snow in northern Scandinavia. In Finland,
where reindeer typically graze on the tundra
throughout the year, snow melt begins later.
In inland Norway, where the pastures are left
ungrazed during summer and vegetation is taller
and more abundant, the snow melts earlier.
More intense reindeer grazing could delay
snow melt and reduce ground heating during
spring in the rapidly warming tundra, the
authors suggest.

Remote Sens. Environ. 135, 107-117 (2013)


Immunity let loose


Experimental therapies that unleash the immune system
to fight cancer, by blocking ‘checkpoint inhibitors’,
continue to show promise in early clinical trials.

Immune checkpoint inhibitors prevent
autoimmunity, and can rein in the immune response against
tumours. Antoni Ribas at the University of California, Los
Angeles, and his colleagues tested lambrolizumab, a
compound that blocks a checkpoint inhibitor called
PD-1, in 135 people with advanced melanoma. Tumours
shrank by at least 30% in 38% of the patients.

In a separate study, Jedd Wolchok at Memorial Sloan-Kettering
Cancer Center in New York and his colleagues
treated 86 people with advanced melanoma using
two compounds: nivolumab, also a PD-1 inhibitor, and
ipilimumab, an approved drug that blocks a checkpoint
inhibitor called CTLA-4. Tumours shrank by at least
half in 40% of the patients.

N. Engl. J. Med. http://dx.doi.org/10.1056/NEJMoa1305133;
http://dx.doi.org/10.1056/NEJMoa1302369 (2013)


Trap holds 
protoplanet dust 


Dust particles spotted around 
a young star support an idea 
about how planets are born. 
Planet formation is a 
paradox: according to 
standard theory, dust grains 
orbiting newborn stars should 
spiral into those stars rather 
than accrete to form planets. 
Astronomers have suggested 
that there are regions, or 
‘pressure bumps’, where
density and pressure gradients 
trap particles long enough to 
allow them to clump together. 
A team led by Nienke van 
der Marel at Leiden University 
in the Netherlands has 
observed such a trap around 
the star Oph IRS 48 located 
about 120 parsecs from 
Earth. The Atacama Large 
Millimeter/submillimeter 
Array in Chile detected a 
crescent-shaped cluster 
on one side of the star 
— probably a reservoir 
of coalescing dust grains 
(pictured as an artist's 
impression). 
Science 340, 1199-1202 (2013) 


Serendipity 
outstrips design 


Accelerated evolution of an 
artificial enzyme improved its 
activity several-thousand fold, 
owing to unexpectedly extreme 
remodelling of its active site. 

A team led by Donald 
Hilvert and Nenad Ban at 
the Swiss Federal Institute 
of Technology in Zurich 
optimized a computationally 
designed enzyme with several 
rounds of random mutagenesis 
and screening. The activity 
levels of the evolving enzyme 
eventually approached those 
of natural enzymes, but the 
protein no longer catalysed its 
reaction using the machinery 
the researchers had intended. 
An amino-acid residue 
installed to help rearrange 
bonds was abandoned for 
one that emerged at another 
location in the active site. Such 
swaps could be important in 
natural-enzyme evolution and 
design efforts, the authors say. 
Nature Chem. Biol. http://dx.doi.org/10.1038/nchembio.1276
(2013)


MOLECULAR BIOLOGY 


Boosting plant 
defence 


The discovery of a gene that 
regulates the effects of the plant 
hormone jasmonic acid might 
lead to ways to increase pest 
resistance in crops, without 
hindering their growth. 
Jasmonic acid helps plants to 
fend off insects and pathogens;
it also regulates aspects of 
plant development, including 
fertility and fruit ripening. 
Daoxin Xie of Tsinghua 
University in Beijing and his 
colleagues identified a gene 
called JAV1 in the model plant 
Arabidopsis thaliana that 
suppresses several responses 
triggered by jasmonic acid. The 
JAV1 protein was degraded
when insects or fungi attacked. 
Silencing JAV1 boosted plant 
resistance to disease, but had 
no adverse effect on fertility or 
other developmental processes. 
Mol. Cell 50, 506-517 (2013) 


COMMUNITY CHOICE


One polymer with multiple forms


HIGHLY READ on pubs.acs.org in May


An unusual material can switch between
polymers from two different classes with
the addition of light.

Da-Hui Qu, He Tian and their colleagues
at the East China University of Science and Technology in
Shanghai combined two types of molecules. Cyclodextrins
form non-covalent complexes to yield supramolecular
polymers, whereas coumarins form covalent bonds with each
other under one wavelength of light, and release those bonds
under another. The material that the researchers created could
go from a supramolecular polymer to a covalent polymer
and back again with the addition of light; adding a detergent
produced a reversible hydrogel. Substances with switchable
properties can combine advantages of distinct polymers in a
single platform, the authors say.
Langmuir 29, 5345-5350 (2013)


PALAEONTOLOGY


Big lizard among mammals


A giant, plant-eating lizard successfully
competed with mammals about 40 million to
36 million years ago.
Researchers led by Jason Head at the University of
Nebraska-Lincoln identified the lizard in
a diverse assemblage of fossils collected in
Myanmar. The teeth and jaws of the creature revealed
that it was a plant-eater, and at an estimated 27 kilograms,
it was one of the largest animals in the area. The
researchers dubbed the species, which was almost
twice the length of any living herbivorous lizard,
Barbaturex morrisoni after the singer Jim Morrison,
who famously proclaimed himself the lizard king.
Reptiles need external heat to keep their bodies warm,
so the hotter temperatures of past climates could have
allowed the large lizards to survive, the authors say.
Proc. R. Soc. B 280, 20130665 (2013)


PLANT SCIENCES 


Tomatoes 
make tubers 


Boosting levels of a
hormone in tomato plants 
(Solanum lycopersicum) 
causes them to make tubers, 
like their sibling species the 
potato (Solanum tuberosum). 

Yuval Eshed at the Weizmann 
Institute of Science in Rehovot, 
Israel, Eliezer Lifschitz at the 
Israel Institute of Technology 
in Haifa and their team 
engineered tomatoes to have 
high levels of a cytokinin, a
type of hormone found in all 
plants. The tomatoes formed 
tiny tubers (pictured) at the 
base of leaves along their stems, 
where cells divide and levels of 
hormones fluctuate. 

Giving potato plants the 
hormone in culture also 
elicited small spuds along plant 
stems. The authors suggest that 
a simple, common mechanism 
might prompt tubers in other 
species. 

Curr. Biol. http://dx.doi.org/10.1016/j.cub.2013.04.061
(2013)
SEVEN DAYS


POLICY 


Gun research 

The US Institute of Medicine
(IOM) has recommended
a broad firearms-research
agenda for the Centers
for Disease Control and
Prevention (CDC) in Atlanta,
Georgia. The 5 June IOM
report poses questions on
topics from the value of
background checks to the
influence of violent media
and games on gun-related
violence. The CDC requested
the report after US President
Barack Obama made an order
in January for it to resume
research on gun violence,
which the agency had stopped
in 1996 when Congress
forbade it from using money
to “advocate or promote” gun
control. See go.nature.com/rufroe
for more.


Brazil emissions 


Brazil’s greenhouse-gas 
emissions fell by nearly 39% 
between 2005 and 2010, 
according to an inventory 
released on 5 June by the 
country’s government. The 
sharp drop was entirely due to
falling rates of deforestation.
However, it was tempered
by rising emissions from the
agriculture and energy sectors,
a concern if Brazil wants to
cut emissions further. Overall,
the country is on track to 
meet goals announced at the 
2009 United Nations climate 
summit in Copenhagen. See 
go.nature.com/xl]3ht for more. 


Watchdog backs off
The US federal office that
punishes breaches in human-research
protections said on
5 June that it would not issue
sanctions over a controversial
university study on how best
to treat premature infants
with oxygen. In March, the
Office for Human Research
Protections had said that
investigators in the study,
which was overseen by
the University of Alabama
at Birmingham, did not
sufficiently inform parents of
the risks to infants in the trial.
In its letter last week, the office
said that the issues raised were
complex enough to require new
guidance for future studies.


Eye in the sky


The US Geological Survey (USGS) says that
data from its latest environmental satellite,
Landsat 8, are now publicly available (at
go.nature.com/u81wkh; shown, an image
over Northwest Arctic Borough in Alaska).
The US$855-million spacecraft extends the
world’s longest continuous Earth-observation
project, which has documented global land-use
trends through more than 3.7 million images
dating back to 1972. Landsat 8 will collect at
least 400 images per day at several visible and
near-infrared frequencies, covering the planet
every 16 days. The USGS has accommodated
11 million downloads since 2008.


Grey-wolf revival
Grey wolves (Canis lupus),
once on the verge of
extinction, have recovered
sufficiently to be removed
from the federal list of
threatened and endangered
species, the US Fish and
Wildlife Service (FWS)
proposed on 7 June. That
would return management of
the animals to state wildlife
agencies. The FWS says that
wolves are expanding their
range and have exceeded
population targets by up to
300%, but environmentalists
fear that state hunting
initiatives will inhibit that
recovery. The FWS proposal
would maintain federal
protection for the Mexican
wolf in the southwest.


Montreal accord
China agreed on 8 June that
it will “work together” with
the United States and other
countries to use the Montreal
Protocol to regulate the potent
greenhouse gases known
as hydrofluorocarbons.
Introduced to replace the
ozone-destroying compounds
outlawed by the Montreal
treaty, these refrigerants are
currently managed under
the United Nations climate
framework because of their
greenhouse effects. Many argue
that they would be phased out
faster and more cheaply under
the Montreal treaty, a view
that China had opposed until
the 8 June agreement.


Accelerator shut 


Following a radiation leak, no 
experiments will take place at 
the Japan Proton Accelerator 
Research Complex (J-PARC), 
in Ibaraki prefecture, until 
early next year, the facility's 
director, Yujiro Ikeda, said on 
10 June. The leak occurred at the Hadron Experimental Facility on 23 May, when a proton beam damaged a gold target, releasing material that exposed 34 workers to low-dose radiation. A malfunction of the beam extraction unit was blamed, and an investigation is under way. All experiments up to the end of July are cancelled. J-PARC was already scheduled to close from August this year until late January 2014 for maintenance.


China in space 

China launched its fifth crewed 
space mission on 11 June. The 
Shenzhou 10 space capsule, 
carrying three astronauts, is 
scheduled to dock with the 
country’s orbiting Tiangong 1 
space module on 13 June,
and aims to return to Earth
on 26 June. It will be the last 
mission to Tiangong 1. China 
plans to launch two more 
modules before 2016, in the 
run-up to building a crewed 
space station by 2020 (see 
Nature 473, 14-15; 2011). 


Max Planck chief 


Martin Stratmann (pictured), a chemist at the Max Planck Institute for Iron Research in Düsseldorf, Germany, was elected on 6 June as the next president of the Max Planck Society in Munich, Germany’s largest non-university basic-research organization. Stratmann, 59, will take office in June next year, replacing developmental biologist Peter Gruss, who has presided over the Max Planck Society since 2002. The society runs more than 80 research institutes in Germany, with an overall budget this year of about €2 billion (US$2.6 billion). See go.nature.com/1jccau for more.


TREND WATCH

A government report on the status of the US oceanographic fleet says that fuel costs for research ships have increased by 400% since 2003, and that pressure on budgets has led to some US vessels being disposed of or laid up. As vessels are retired, the fleet is set to shrink rapidly unless new ships are built or old ones overhauled (see chart). New vessels currently being built will not stem the fleet’s decline, even if they enter service as scheduled. See go.nature.com/fvcs5z for more.

SOURCE: NATL OCEAN COUNCIL


RESEARCH
How to cut carbon 


With carbon emissions climbing by 1.4% to 31.6 gigatonnes in 2012, the world is headed for a long-term temperature rise of 3.6–5.3 °C, said the International Energy Agency in a 10 June report. The agency, based in Paris, endorsed four cost-effective policies to help set the world on a path to a 2 °C rise. Its major recommendation is the adoption of energy-efficiency measures such as performance standards for lighting, heating and road vehicles. Other policies involve limiting the construction and use of inefficient coal-fired power plants; minimizing methane leaks from the oil and gas industry; and phasing out fossil-fuel subsidies.


BUSINESS
Science networking 


A professional networking 
site for researchers has
raised US$35 million from
investors including Microsoft 
co-founder Bill Gates. 
ResearchGate, headquartered 
in Berlin, was founded in 
2008 by two virologists and 
says it now has more than
2.9 million members. It is one
of a number of sites (including 
Academia.edu and Mendeley) 
that aim to be hubs for 
scientists to connect and share 
publications. ResearchGate 
announced the latest funding 
on 4 June, but declined to 
disclose investment raised in 
two previous funding rounds. 


Ups and downs

Pharmaceutical firm AstraZeneca will pay US$560 million up front for Pearl Therapeutics in Redwood City, California, which is developing an inhaled treatment for lung conditions such as bronchitis and emphysema (collectively known as chronic obstructive pulmonary disease). London-based AstraZeneca announced the purchase (which could cost up to $1.15 billion if other milestones are met) on 10 June. Six days earlier, it said that it had pulled the plug on a once-promising rheumatoid arthritis drug, fostamatinib, after a series of late-stage clinical-trial failures. The company returned licensing rights for the drug to biotechnology firm Rigel in South San Francisco, California, from which it bought the rights for more than US$100 million in 2010 (see Nature http://doi.org/fcxh6c; 2010).


Chart: US RESEARCH FLEET NEEDS RESUPPLY
The US oceanographic fleet is on course to roughly halve by 2026 as ships are retired from service.


SEVEN DAYS | THIS WEEK


17 JUNE
The European Commission, the European Parliament and the Council of the European Union negotiate the final details of their 2014–20 research-funding programme, Horizon 2020.


17–21 JUNE
The latest research into detecting nuclear explosions is presented at a conference hosted by the Comprehensive Nuclear-Test-Ban Treaty Organization in Vienna.
go.nature.com/xtgtso


Diabetes debate 

Advisers to the US Food and Drug Administration say that the agency should ease its restrictions on access to Avandia (rosiglitazone), a diabetes drug linked to increased heart risk. Three years ago, US regulators sharply curtailed access to the drug, which is made by London-based GlaxoSmithKline. European authorities pulled the drug from the market altogether (see Nature 467, 505; 2010). But the advisory committee, meeting on 6 June, noted that a re-analysis of a pivotal clinical trial suggested that people taking the drug were not more likely than others to die from heart complications. See go.nature.com/4zlswa for more.


China’s decision to reduce its carbon dioxide emissions should help to alleviate the dangerously high levels of air pollution in the country.


CLIMATE CHANGE 


China gets tough on carbon 


Cap-and-trade pilot schemes set stage for nationwide roll-out. 


BY JANE QIU 


China, responsible for about one-quarter of the world’s carbon dioxide emissions, has ambitious goals to reduce them — but has been unwilling to set absolute targets for fear of slowing economic growth. There are now signs that its position is changing.

On 18 June, the country will launch an 
emissions-trading scheme in the southern city 
of Shenzhen, marking its first attempt to cut 
emissions using market mechanisms. Under the 
scheme, more than 630 industrial and construc- 
tion companies will be given quotas for how 
much carbon dioxide they can emit. Companies 
that pollute more than they are allowed will have 
to buy credits from cleaner counterparts that 
reduce emissions below their quota — thereby 
creating a price for the greenhouse gas. 

Another six such cap-and-trade schemes will
be rolled out by the end of the year in the cities
of Beijing, Tianjin, Shanghai and Chongqing,
and the provinces of Guangdong and Hubei.
The trial will cover 864 million tonnes of carbon 
dioxide by 2015 — around 7% of China's total 
emissions and about the total amount emitted 
by Germany each year, according to a report by 
the London-based analyst firm Bloomberg New 
Energy Finance. These regional pilot schemes 
will set the stage for the nationwide carbon 
market that is scheduled to launch in 2016. 

NATURE.COM
For more on China’s emissions and pollution, see: go.nature.com/clpsie

China has committed to cutting its carbon intensity — carbon emissions per unit of gross domestic product — by 40–45% from 2005 levels by 2020, which allows for increases in emissions, although at a slower rate. The initial emissions limits for the regional schemes will be set by applying the carbon-intensity targets to the emissions of individual companies. In 2016, this system will be scaled up nationally, again in line with carbon-intensity targets.
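
Why an intensity target still allows total emissions to rise can be seen with a little arithmetic. The sketch below is illustrative only: the function names, the emissions index of 100 for 2005 and the assumption that gross domestic product triples by 2020 are hypothetical and are not figures from the report.

```python
def implied_emissions(base_emissions, intensity_cut, gdp_growth_factor):
    """Emissions implied by a carbon-intensity cut combined with GDP growth.

    Intensity = emissions / GDP, so emissions = intensity * GDP. Cutting
    intensity by `intensity_cut` while GDP is multiplied by
    `gdp_growth_factor` scales emissions by the product of the two factors.
    """
    return base_emissions * (1.0 - intensity_cut) * gdp_growth_factor


def break_even_growth(intensity_cut):
    """GDP growth factor above which total emissions rise despite the cut."""
    return 1.0 / (1.0 - intensity_cut)


if __name__ == "__main__":
    base = 100.0          # 2005 emissions, indexed to 100 (illustrative)
    assumed_growth = 3.0  # hypothetical: GDP triples between 2005 and 2020
    for cut in (0.40, 0.45):  # the 40-45% range quoted in the article
        total = implied_emissions(base, cut, assumed_growth)
        print(f"cut {cut:.0%}: emissions index {total:.0f}, "
              f"break-even GDP factor {break_even_growth(cut):.2f}")
```

Under that assumed tripling of GDP, the emissions index ends up at roughly 165 to 180, well above the 2005 baseline of 100, even though the intensity target is met. Total emissions fall only if the economy grows by less than the break-even factor.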

After 2020, this plan is likely to be replaced with an absolute cap that would require a decline in overall emissions covered under the scheme. Such a move will depend on the effectiveness of an array of planned energy policies, researchers say. “It’s not difficult from a technical point of view,” says Xiang Gao, a member of China’s climate-talks delegation and a researcher at China’s National Development and Reform Commission (NDRC), the powerful ministry responsible for planning the country’s economic and social development. “It’s a matter of political will — which, in turn, will depend on whether the top leadership can be convinced that such a move is best for the country’s economy and social stability,” Gao says.

Researchers say that China has reasons beyond climate change to implement emission caps. In the past few years, rampant air pollution has caused increased public resentment and social unrest across the country. “China may not have a choice any more,” says Knut Alfsen, head of research at the Centre for International Climate and Environmental Research in Oslo. “It’s just much better to control total emissions.”

A commitment from China to cap emissions “would breathe new life into climate talks”, adds Alfsen, who is also a member of the China Council for International Cooperation on Environment and Development, an international think tank that works closely with China’s cabinet and the NDRC. At the next climate-change summit, in Paris in 2015, nearly 200 countries will aim to reach a legally binding global agreement on emissions cuts, which would take effect in 2020. Kelly Sims Gallagher, an expert on energy and environmental policy at Tufts University in Medford, Massachusetts, says that an ambitious emissions cap from China “would send a strong political signal to the world” and would make it easier to pass more aggressive climate legislation in the United States, where there is strong political resistance to national climate regulations.

Most researchers contacted by Nature are only cautiously optimistic that China can cap its emissions. A carbon ceiling for China “depends in part on how successful the pilot schemes will be”, says Lei Ming, an environmental economist at Peking University in Beijing. “We will have to cross the river by feeling the stones,” he says, citing the famous one-liner by the late reformist leader Deng Xiaoping.

One of the main challenges for the nationwide cap-and-trade scheme will be establishing its credibility. Verifying emissions, for instance, will be difficult in such a large country, says Gallagher. David Yuetan Tang, board secretary of the Tianjin Climate Exchange, which is in charge of one of the seven pilot emission-trading schemes, says that there is an institutional void about who will do this — and also a legal void about how companies will be punished for fraudulent claims or emissions excesses. “This is absolutely paramount, because emission quotas are money,” he adds.

Moreover, whether emissions trading can 
work under China's political system remains 
to be seen, critics say. “The energy market
in China is not entirely free and has a lot of
government interference and monopoly,” says
Qi Ye, an environmental-policy researcher at
Tsinghua University and director of the Beijing 
office of the international think tank Climate 
Policy Initiative. The price of electricity, for 
instance, is heavily controlled, he says, which 
could seriously diminish the impact of impos- 
ing a carbon price on electricity producers. 

Emissions trading is just one of a series of 
energy and pollution policies due to be intro- 
duced in the next few years. For instance, Beijing 
is considering implementing a carbon tax to 
rein in pollution by sectors not covered by cap 
and trade, and continues to invest aggressively 
in renewable energy. It has also pledged to 
reduce the production and use of hydrofluorocarbons,
powerful greenhouse gases used in
refrigeration and air conditioning.


PERSONALIZED MEDICINE 


‘Master protocol’ aims to 
revamp cancer trials 


Pilot project will bring drug companies together to test targeted lung-cancer therapies.


BY HEIDI LEDFORD 


In the push to match medical therapies to the genetic underpinnings of disease, lung-cancer treatments have been at the frontier. But the 1.6 million people diagnosed with this cancer every year will take scant comfort in knowing that of the past 20 late-stage trials of drugs to treat it, only two yielded positive results. And in only one of those 20 were patients chosen systematically by screening for biomarkers such as relevant blood proteins or DNA sequences.

Now, an ambitious project aims to improve those success rates and speed new treatments to market by matching companies with the patients whose tumours are most genetically relevant to the therapies they are trying to develop. The project is slated to launch next year and, if successful, could be expanded to other cancers.

The project was spearheaded by the Friends of Cancer Research, a think tank and advocacy group in Washington DC, and has won the support of the US National Cancer Institute and the US Food and Drug Administration (FDA). The idea is to streamline the drug-approval process by bringing pharmaceutical companies together to test multiple experimental drugs in late-stage clinical trials under a single, ‘master’ protocol. “The drive is to make the whole process of personalized medicine more efficient,” says Eric Rubin, vice-president of oncology clinical research at Merck, a pharmaceutical firm based in Whitehouse Station, New Jersey.


PLUG AND PLAY 

Launching a large, late-stage clinical trial typi- 
cally takes more than two years and requires 
some three dozen administrative and regu- 
latory approvals. To simplify this tangle, the 
master protocol will create an experimental 
plan to test several candidate drugs in hun- 
dreds of clinics across the United States. The 
initial protocol is expected to include up to six 
drugs; others may be added later, without the 
need for fresh protocol approval each time. 
“It’s like a Plug and Play,” says David Gandara,
an oncologist at the University of California,
Davis, who is in charge of drafting the plan. “So
you don’t waste time over and over.”
Gandara has advocated this approach for
the past decade, but the FDA and the pharma- 
ceutical industry voiced support only recently 
— swayed by a growing body of data revealing 
that cancers are, in effect, many rare diseases 
with different genetic roots (see Nature 455, 
148; 2008). A genetically targeted drug may 
work, but only in a fraction of cases. Such rare 
effects could easily be overlooked in a trial that 
contains a mix of patients whose cancers have 
heterogeneous causes, and the costs for drug 
companies to sort them all and run scores of 
separate trials are prohibitive. 

Under the master protocol, by contrast, 
patients will be screened for various biomark- 
ers and assigned to trials for drugs that are most 
likely to be effective. The approach does away 
with the need for patients to undergo multiple 
screenings: participating companies could enrol 
them from a large, central pool. It also eases 
pressure on the (often minute) tissue samples 
taken during lung biopsies, because many tests 
can be done at the same time, says Rubin. 

A similar model is already being tested in
two smaller clinical trials for breast and lung
cancers (see Nature 464, 1258; 2010). Both
trials involve multiple biomarkers, drugs 
and clinics, and both won support from 
pharmaceutical companies. But that does 
not mean that drug companies will embrace 
a larger, more developed venture, says Roy 
Herbst, an oncologist at the Yale School 
of Medicine in New Haven, Connecticut, 
who chairs the steering committee of the 
master-protocol project. It is much easier 
to coax a company into a group effort for a 
small, early trial than to persuade it to give 
up any measure of control over a late-stage 
one crucial for gaining regulatory approval. 

Companies also prefer to maintain con- 
trol of proprietary information rather than 
deposit early results into centralized data- 
bases. “It’s a challenge,” says Herbst. “Many
of them might think they can do it alone,
and may worry about losing autonomy.”

The project's organizers tried to address 
industry concerns early on, says Ellen 
Sigal, founder and chairwoman of Friends 
of Cancer Research. At a planning meeting 
in March, representatives from more than 
20 drug companies were assured that the 
FDA supports the protocol and has statis- 
ticians working to help shape it — making 
the agency more likely to feel comfortable 
basing approval decisions on data from the 
trial. Organizers also pledged to have a neu- 
tral third party monitor the trial, to ensure 
that drugs made by competing companies 
would not be directly compared. 

Gandara hopes that the speed and lower 
costs will also draw industry partners. 
Late-stage clinical trials can cost between 
US$50 million and $100 million; Gandara 
estimates that the master protocol could cut 
that to $25 million or less. 

Companies might also be wooed by easy 
access to the National Cancer Institute's vast 
network of treatment centres and clinicians 
who are experienced in conducting clini- 
cal trials. That network will allow the trial 
to be conducted at 500 sites in the United 
States and Canada and enable it to enrol up 
to 1,000 patients a year. 

Thus far, the downside of participating 
seems minimal, says Richard Gaynor, head 
of oncology-product development at Eli
Lilly, a pharmaceutical firm based in Indianapolis,
Indiana. “It will be an interesting
experiment,” he says.
GASTROENTEROLOGY


FDA gets to grips 
with faeces 


Regulator triggers efforts to standardize faecal transplants. 


BY BETH MOLE 


The brown slurry is piped through tubes
into the top of the human body — or the
bottom. It can even come in pill form.
For years, doctors have been transferring 
faeces into ill people’s intestines to replace 
resident microbes with a fresh batch. The 
procedure is often a therapeutic success, but 
protocols for it vary wildly. As it steadily grows 
more popular, regulators are now working 
to define what a standard faecal transplant 
should be, and how to deliver one safely. 
During a public workshop last month at the 
US National Institutes of Health in Bethesda, 
Maryland, the Food and Drug Administration 
(FDA) reaffirmed that it has authority over 
faecal transplants. The agency had said this
for years to researchers and companies who
asked privately, but the workshop was the first 
public forum in which the FDA broadcast that 
it regulates faeces like a drug. 

Clinical trials of the procedures are not 
affected, because they were already subject to 
approvals from the agency. But US doctors performing
faecal transplants as treatments must
now submit an Investigational New Drug application
to the FDA with details about their protocols.
(The agency then has 30 days in which
it can intercede and stop an experiment.) Jay
Slater, director of the division of bacterial, 
parasitic and allergenic products at the FDA 
in Silver Spring, Maryland, says that the move 
is a crucial way for the agency to make sure that 
protocols are safe. But he adds that the FDA wants to avoid being too prescriptive for now, so that it can adopt the most effective, advanced protocols as they are developed.


GUT INSTINCT
Faecal transplants are an increasingly popular way to treat infections of Clostridium difficile, but approaches vary wildly.

Clinic | Route | Stool amount | Stool freshness | Blending method | Patients treated | Claimed success rate
Mayo Clinic, Rochester, Minnesota | Colonoscopy | 50 g | <6 hours old | Lab paddle blender | ~40 | 90–95%
Nebraska Medical Center, Omaha | Nasal tube | 30–50 g | <6 hours old | Blender | 17 | 94%
The Bright Medicine Clinic naturopathic practice, Portland, Oregon | Enema | 50–300 g | Frozen or <6 hours old | Blender | 8 | 88%
Kingston General Hospital, Canada (clinical trial) | Colonoscopy | 100 ml (synthetic) | Cultured | Hand-mixed | ~30 | Planned trial
Thomas Louie’s private practice, Calgary, Canada | Capsules | 0.47 ml per pill | <6 hours old | Food mill | 36 | 100%


Although it may be years before the agency weighs in on which method is the safest, it has ignited a debate among researchers over how faeces should be screened, processed, delivered — or even synthesized.

With faecal transplants, doctors aim to reestablish healthy microbe populations in the guts of patients. The procedure seems especially effective for people infected with Clostridium difficile, a diarrhoea-causing bacterium that in the past two decades has become more prevalent and antibiotic-resistant in the United States, where it now kills an estimated 14,000 people each year. A 2011 review of data from more than 300 patients concluded that faecal transplants can cure 92% of people with recurring C. difficile infections for which antibiotics prove ineffective (E. Gough et al. Clin. Infect. Dis. 53, 994–1002; 2011).

But there are many issues and unanswered questions. The method’s success against C. difficile has led to an “outrageous exuberance”, says Amee Manges, an epidemiologist at the University of British Columbia in Vancouver, Canada, who led the review. Some doctors are using faecal transplants to treat other conditions, for which effectiveness is less established — or not established at all. Faeces, if not properly screened, can transmit disease. Furthermore, it is too early to know which of the many protocols is the most effective. “Everybody has their preferences,” says Manges (see ‘Gut instinct’). Resourceful individuals can even get in on the act at home, by following step-by-step enema instructions from online videos.

The wide variation in clinical practice starts at the very source of the ‘drug’. Although evidence is lacking, some researchers suspect that the best stool comes from a patient’s blood relatives, who have genetic and environmental similarities with the patient that might influence their gut microbes. Other doctors use anonymous donors.

Preparation methods also differ. Some researchers freeze the stool for convenience, to use later. Others insist that it must be fresh — 6 hours old or less — to ensure that the bacteria do not die or change their behaviour during their time outside the colon. Fresh or frozen, the stool is mixed with a liquid — usually saline, although some researchers have tried water or even milk. Others are exploring synthesizing faeces from scratch (see ‘How to make a stool’).

FAKE FAECES
How to make a stool

The runny, cloudy substance, developed by Elaine Petrof, an infectious-disease researcher at Queen’s University in Kingston, Canada, is one of the first prototypes of synthetic stool: a mixture of 33 microbes individually isolated from the faeces of a healthy donor and then recombined.

Petrof chose the donor after a stringent screening protocol, selecting a woman who was infection- and parasite-free and clear of chronic diseases and drugs, and who had taken antibiotics only once in her life, long ago.

For the microbe mix, Petrof chose a combination of beneficial bacteria and others known to support the overall microbial community, and threw the rest away. The chosen bacteria were grown in pure culture, then mixed together in saline solution at ratios that replicated their original proportions in the stool.

In January, Petrof and her colleagues published results showing that the slurry of microbes, called Re-POOPulate, could be used in faecal transplants, curing two people of life-threatening Clostridium difficile infections (E. O. Petrof et al. Microbiome 1, 3; 2013). To confirm the results, the team plans to enrol 30 people in a clinical trial. B.M.


When it comes to the mode of delivery, some researchers use enemas — easy to administer, but good for transplanting stool into only the lower end of the colon. Others use colonoscopies, which are more invasive but ensure that the stool makes it farther into the intestines.

Johan Bakken, an infectious-disease con- 
sultant at the University of Minnesota Medical 
School in Duluth, who has used faecal trans- 
plants to treat 120-130 people with C. difficile 
infections, delivers the preparation in a tube 
threaded through a patient’s nose into the 
small intestine. This, he argues, may be safer 
for people with disease-weakened intestinal 
walls that could be torn during enemas or 
colonoscopies. Thomas Louie at the University 
of Calgary in Canada avoids tubes altogether: 
so far, he has treated 33 people by delivering 
stool microbes orally, wrapped in three layers 
of medical-grade gelatin capsules. 

Researchers generally agree that donors must 
be screened using standardized procedures if 
faecal-transplant therapies are to flourish. 
Many are concerned that inadequately checked 
material could contain pathogens, just as blood 
transfusions sometimes caused transmission of 
hepatitis C in the days before screening. Faecal 
screens tend to include tests for blood-borne 
pathogens such as HIV and hepatitis viruses, 
as well as intestinal pathogens and parasites. 
But some scientists have collected anecdotes of 
donors who were not even tested for obvious 
pathogens such as HIV and C. difficile. 

One of the most pressing questions is which 
diseases can be treated effectively with faecal 
transplants. In addition to C. difficile infec- 
tions, researchers have used the procedures to 
treat chronic problems such as Crohn's disease, 
inflammatory bowel disease and multiple scle- 
rosis, but in very small case studies. Some clin- 
ics are even recommending faecal transplants 
for obesity, Parkinson's disease or autism spec- 
trum disorder — although most doctors remain 
sceptical. More data and oversight are needed to 
enable researchers to learn which applications 
work and which do not, says Gary Wu, a gastro- 
enterologist at the University of Pennsylvania in 
Philadelphia. “Stool is a very complex mixture that we don’t fully understand,” he says.

Wu expects that researchers will eventually move to synthetic stool, a potentially safer and more consistent concoction. “But right now, we’re not at that point,” he says.

Some researchers expressed frustration with the FDA’s move to regulate faecal transplants. Infectious-disease specialist Trevor Van Schooneveld of the University of Nebraska Medical Center in Omaha has performed about 20 such transplants since 2011, working with gastroenterologists. But in the past few weeks, he has turned away three patients while he submits the required Investigational New Drug application. Van Schooneveld questions whether the agency should preside over an organic, personal substance, rather than a drug. “How the FDA plans to regulate human faeces is a mystery to me,” he says.
Light flips transistor switch


Photons emerge as competitors to electrons in computer circuits. 


BY DEVIN POWELL 


Transistors, the tiny switches that flip on and off inside computer chips, have long been the domain of electricity. But scientists are beginning to develop chip components that run on light. Last week, in a remarkable achievement, a team led by researchers at the Massachusetts Institute of Technology (MIT) in Cambridge reported building a transistor that is switched by a single photon.

Conventionally, photons are used only 
to deliver information, racing along fibre- 
optic cables with unparalleled speed. The 
first commercial silicon chip to include opti- 
cal elements, announced last December, did 
little to challenge the status quo. The on-board 
beams of light in the device, developed at IBM’s 
research centre in Yorktown Heights, New 
York, merely shuttle data between computer 
chips. 

Now, Wenlan Chen of MIT and her col- 
leagues have taught light some new tricks, 
using a cloud of chilled caesium atoms sus- 
pended between two mirrors. Their transis- 
tor is set to ‘on’ by default, allowing a beam of 
light to sail through the transparent caesium 
cloud unmolested. But sending in a single 
‘gate’ photon turns the switch off, thanks to 
an effect called electromagnetically induced 
transparency. The injected photon excites the 
caesium atoms, rendering them reflective to 
light trying to cross the cloud (see “Turn off 
the light’). One photon can thus block the pas- 
sage of about 400 other photons, says Chen, 
who presented the result on 7 June at a meet- 
ing of the American Physical Society's Division 
of Atomic, Molecular and Optical Physics in 
Quebec City, Canada. 

The ability to turn a strong signal on and off 
using a weak one fulfils a key requirement of 
an optical transistor. “Nothing even came close
before,” says physicist Atac Imamoglu of the
Swiss Federal Institute of Technology Zurich,
who called the experiment “a true breakthrough”.
In theory, the hundreds of photons,
controlled by the triggering photon, could fan
out and switch off hundreds of other transistors
in an optical circuit.

With its exotic clouds of atoms and bulky 
equipment, the proof-of-principle transistor 
is unlikely to become a component in every- 
day computers. But it could be a useful tool 
for studying how photons interact at the quan- 
tum level — potentially leading to a quantum 
transistor that flips, not a one or a zero as in
classical computing, but a fuzzy bit of quantum
information.

[Figure: ‘Turn off the light’. Labels: cold caesium atoms; light beam.]

A more practical optical transistor debuted 
in April 2012 at Purdue University in West 
Lafayette, Indiana, where electrical engineer 
Minghao Qi has made one that is compatible 
with the semiconductor industry’s existing 
manufacturing techniques. “The advantage
of our device is that we have it on a silicon
chip,” says Qi.

In this case, the beam of light to be
switched on and off enters and exits along
a channel, etched in the silicon, that sits
next to a parallel channel. In between the two
rails is an etched ring. When a weaker light 
beam courses through the second optical line, 
the ring heats up and swells, interfering with 
the main beam and switching off the transistor. 
This switch can flip on and off up to 10 billion 
times per second. 

And the output beam can fan out and drive 
two other transistors, meeting one of the
established requirements for an optical transistor
set out in 2010 by David Miller, a physicist at 
Stanford University in California. Other cri- 
teria include matching the frequency of the 
exiting signal to the input frequency and keep- 
ing the output clean, with no degradation that 
could cause errors. “Making an optical transis- 
tor that really satisfies the necessary criteria is 
very hard,” says Miller. 

Still, Qi does not expect to challenge the
electronic transistor with his optical analogue,
which consumes a lot more power and runs
much more slowly. “We […] markets, such as
equipment for scrambling
cable channels and military technologies that
could benefit from light’s imperviousness to an
electromagnetic attack.

[Figure caption, ‘Turn off the light’: Researchers have succeeded in
using a single photon to switch off a beam of light, a key step in
demonstrating an optical transistor. Labels: switching photon excites
atoms; light beam blocked.]

Routers that guide information through 
the Internet could also be amenable to opti- 
cal transistors and switches. At present, these 
stopping points in the network convert opti- 
cal signals travelling through fibre-optic cables 
into electrical signals; these are then processed, 
converted back to light and sent on their way. 
A router in which one beam of light pushes 
another in the appropriate direction — with 
no conversions involved — could in principle 
be faster and consume less energy. 

A popular candidate for such switches is the
quantum dot, a small semiconductor crystal
that behaves like an atom. In one particularly
sensitive quantum-dot switch, a beam of light 
is first guided along a material dotted with 
holes, called a photonic crystal. The light can 
pass through a quantum dot placed in its path 
without changing course. But if a pulse of light
is sent in just ahead of that beam, it can induce 
an interaction between the dot and the crystal 
that scatters the beam and sends it on a differ- 
ent path. 

Reported in May 2012 by Edo Waks of the 
Joint Quantum Institute at the University of 
Maryland in College Park and colleagues, the device
switches when struck by a pulse of 140 photons.
In principle, that is a small enough
amount of energy to rival conventional routers. 

But the switch still faces a practical obsta- 
cle common to all of these emerging optical 
technologies. The lasers that supply the devices 
with light consume considerable energy, offsetting
any savings. “Right now,” says Waks, “the
overhead is what's killing us.”

Speed test for wild cheetahs


State-of-the-art collar reveals animal’s quick reflexes and phenomenal acceleration. 


BY MATT KAPLAN 


The cheetah crouches in the undergrowth.
When a young antelope strays
a little too far from the herd, the cheetah 
explodes out of the bush — and, with a burst 
of speed unrivalled in the natural world, brings 
down its next meal. 

Or so we have assumed. But the first study to
collect data on the animal’s movements in the 
wild reveals that, contrary to popular opinion, 
a cheetah’s sheer speed is not its only weapon 
when it comes to hunting. Its success as a pred- 
ator also hinges on its lightning reflexes and its 
ability to accelerate faster than a Ferrari. 

Determining just how fast animals run is 
no easy task. In zoos, captive cheetahs lured to 
run in a straight line can attain speeds of up to
29 metres per second — nearly 105 kilometres 
per hour (N. C. C. Sharp J. Zool. 241, 493–494;
1997), more than double the top speed
achieved by a human sprinter. But nobody had 
been able to determine whether the animals 
actually reach these speeds in the wild. 

Armed with lightweight solar-powered col- 
lars fitted with Global Positioning System and 
inertial-measurement technologies, a team led 
by Alan Wilson at London's Royal Veterinary 
College has precisely tracked wild cheetahs 
during hunting (see page 185). 

The team first tested the accuracy of their 
collars on a dog by letting the animal loose on
a beach; that way, the information collected 
could be cross-referenced with the paw prints 
left in the sand. The collars proved to be aston- 
ishingly precise, down to a level of 0.2 metres. 

The researchers then travelled to the Oka- 
vango Delta region of northern Botswana, 
where they used darts to sedate and collar 
five cheetahs. The 17 months spent collect- 
ing the data was the easy bit; what posed the 
challenge was making sense of the data, which 
confounded expectations. 

Out of a total of 367 runs, the fastest speeds 
achieved by the five individuals were 25.9, 25.4, 
22.0, 21.1 and 20.1 metres per second — all 
well short of the record set by captive cheetahs. 
Moreover, most hunts involved only moder- 
ate speeds, with the average top speed com- 
ing in at 14.9 metres per second. But although 
the wild cheetahs did not run as fast as their 
captive kin, they demonstrated other athletic 
abilities that researchers had not been able to 
measure before. 
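
The gap between the wild and captive figures is easier to judge in familiar units. The short Python sketch below is an illustration only and is not part of the study: it converts the speeds quoted above to kilometres per hour and expresses each as a fraction of the 29-metres-per-second captive record. The function and variable names are hypothetical.

```python
# Speeds quoted in the article, in metres per second.
CAPTIVE_RECORD_MS = 29.0                             # captive cheetah record
WILD_TOP_SPEEDS_MS = [25.9, 25.4, 22.0, 21.1, 20.1]  # fastest run per wild cheetah
WILD_MEAN_TOP_SPEED_MS = 14.9                        # average top speed over all runs


def ms_to_kmh(speed_ms: float) -> float:
    """Convert metres per second to kilometres per hour (1 m/s = 3.6 km/h)."""
    return speed_ms * 3.6


def fraction_of_record(speed_ms: float, record_ms: float = CAPTIVE_RECORD_MS) -> float:
    """Return a speed as a fraction of the captive record."""
    return speed_ms / record_ms


if __name__ == "__main__":
    for speed in WILD_TOP_SPEEDS_MS + [WILD_MEAN_TOP_SPEED_MS]:
        print(
            f"{speed:.1f} m/s = {ms_to_kmh(speed):.1f} km/h "
            f"({fraction_of_record(speed):.0%} of the captive record)"
        )
```

Run as a script, this prints one line per quoted speed, which makes it plain that even the fastest wild run falls short of the captive record.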

The data revealed the cheetahs’ acceleration
powers to be up to 120 watts per kilogram,
about double the power of the swiftest greyhounds
and more than four times that of Usain
Bolt during his record-breaking 100-metre
sprint in 2009. The cheetahs were also able to
slow down rapidly, absorbing energy at a rate 
up to three times greater than that known for 
the best-performing polo horses — animals 
that are bred to be agile. 

When the researchers combined this infor- 
mation with field observations of cheetah kills 
and terrain information provided by Google 
Earth, they realized that the cheetahs were 
often hunting successfully amid thick vegeta- 
tion by making sharp turns and sudden stops. 
“We have always thought of cheetahs as sprint- 
ers, but now it looks as though sprinting is only 
part of the story,” says Wilson.

“It is remarkable,” says evolutionary biologist
David Carrier at the University of Utah. “Both
agility and manoeuvrability turn out to be at
least as important to these animals as speed.”

Anticipation of what Wilson's collars might 
reveal in the future is growing fast. “I really 
wonder if cheetahs living on the open savan- 
nahs will yield the same sorts of results,” says 
Jack Grisham, coordinator of the cheetah spe- 
cies survival plan for the Association of Zoos 
and Aquariums. 

Carrier, meanwhile, hopes the collars will 
soon be used to study the movements of other 
animals in the wild. “Simultaneous record- 
ings from each member of a pride of lions or 
a pack of wild dogs would prove fascinating,” 
he says.

UK scientists fear
further cuts 


Funding jitters rife ahead of government spending review. 


BY DANIEL CRESSEY 


With anxiety rising about what the
immediate future may hold for
Britain’s science funding, the man
responsible for the nation’s finances is trying 
to allay researchers’ fears. 

Science “is a personal priority for me”,
chancellor of the exchequer George Osborne 
told reporters on 6 June after a ceremony to 
mark the completion of the roof of the new 
£650-million (US$1.1-billion) Francis Crick 
Institute under construction in London. 

On 26 June, Osborne is set to unveil the next 
comprehensive spending review (CSR), which 
sets spending for government departments. He 
said that he hoped to make clear the govern- 
ment’s “long-term commitment” to research 
in the new review, but scientists fear another 
budget freeze. Asked if he could cut science 
after his supportive statements, the chancellor 
said that he would not pre-empt the CSR but 
added: “You can read between the lines that 
I'm going to do everything I can to make sure 
Britain has a bright scientific future.” 

The previous spending review, in 2010, set 
budgets for government departments for the 
four financial years to 2014-15. June's CSR will 
apply to just the 2015-16 fiscal year — because 
a new budget will be crafted after a general 
election in May 2015 — and Osborne has made 
it clear he wants cuts to most departments. 

Despite its short duration, this CSR is impor- 
tant, says Kieron Flanagan, a science-policy 
researcher at Manchester Business School. “You 
can do damage in one year” if spending is cut 
back severely, and whoever wins the election in 
2015 would be likely to work from the existing 
framework, he says. 

Analysts are especially keen to know what 
the government will do with the ‘ring fence’ 
that was placed around the science budget 
in 2010, freezing it at £4.6 billion a year. The 
fence spared core spending areas — such 
as grants that are awarded by the country’s 
research councils — from the cuts inflicted 
on other public sectors, although the science 
budget still lost money in real terms each 
year. The umbrella group Universities UK has 
calculated that, when inflation is taken into 
account, the deficit is £600 million over the 
current four-year CSR period. 

And, in any event, the ring fence had holes. 
The 2010 CSR moved capital spending in
science — monies allotted to large infrastruc- 
ture projects such as buildings and facilities 
— outside the ring fence, away from the core 
science budget. That made infrastructure vul- 
nerable to cuts, and projects such as the United 
Kingdom Infrared Telescope in Hawaii face 
closure as a result (see Nature 486, 168; 2012). 
Many policy analysts expect the ring fence 
around science funding to be retained in 
the new CSR. But some worry that it may
be removed or that additional categories
of science money could be moved outside it.

One rumour in circulation is that the
Medical Research Council (MRC),
which is a major funder of UK medical 
research, will be moved from the Department 
for Business, Innovation and Skills — the 
department in charge of the science budget — 
to the Department of Health, where it might 
be more vulnerable to cuts or to a change in 
research focus. In a 6 June statement, Ted 
Bianco, acting director of the biomedical- 
funding charity the Wellcome Trust, called 
the prospect “ill-advised and potentially
damaging”, adding that it would shift the balance
“from fundamental to applied research when 
both are essential to medical progress”. 

Osborne would not comment during the 
Crick Institute event on a move for the MRC, 
but said that “the absolutely crucial thing is 
we fund basic scientific research — including 
basic scientific research in medicine — and 
I’m not prepared to do anything that puts that 
at risk”. 

James Wilsdon, a science-policy researcher 
at the University of Sussex, says that another 
year of flat cash for science would be “painful 
but survivable”. Deeper cuts, he says, would be 
another matter.


CORRECTION 

The News Feature ‘The gun fighter’ (Nature 
496, 412-415; 2013) wrongly implied 
that blogger David Codrea had ‘outed’ gun 
researcher Garen Wintemute. Wintemute 
had in fact publicized his own work before 
Codrea’s 2007 blog post. 
The launch of several science mega-prizes
is making some researchers
millionaires — but others question 
whether such awards are the best 

way to promote their field. 


As reactions to winning a multimillion-dollar
prize go, Alexander Polyakov’s words were less
than gushing. It was the culmination
of a glittering award ceremony in Geneva, 
Switzerland, in March, hosted by Hollywood 
actor Morgan Freeman and featuring an 
operatic interlude from British singer Sarah 
Brightman. After a hushed pause, the physicist 
from Princeton University, New Jersey, was 
revealed as the winner of the 2013 Fundamen- 
tal Physics Prize and an accompanying pay- 
ment of US$3 million. “This new prize is an 
interesting experiment,” a flustered Polyakov
said moments later, after walking off stage 
clutching his sculptured silver trophy. “Such 
big prizes could become very influential and 
they can have a positive impact, or they can be
very dangerous.” 

Polyakov’s ambivalence echoes the senti- 
ments of many scientists towards the rash of 
big-money science prizes that have emerged 
over the past year. Sponsored by billionaire 
entrepreneurs, including Russian Internet 
mogul Yuri Milner, Facebook supremo Mark 
Zuckerberg, Google co-founder Sergey Brin 
and property developer Samuel Yin, the new 
awards outstrip the $1.2-million Nobel prizes 
in monetary value. In addition to Milner’s Fun- 
damental Physics Prize, the Internet billionaires 
have together created the Breakthrough Prize in 
Life Sciences, and Yin has introduced the Tang 
Prize as an Asian complement to the Nobels. 

The founders of these ‘new Nobels’ hope that 
the winners will act as role models, inspiring
future generations to pursue science, and that 
they will attract status and funding to the entire 
discipline. “We wanted to choose an amount 
that would be shocking,” says Anne Wojcicki, a 
biotechnology analyst and Brin’s wife, who sits 
on the board of the Breakthrough Prize. “We 
wanted to create science superheroes.” 

But the lavishness and ambition of the prizes 
have sparked criticism. “I don’t want to run 
these awards down, but I find it offensive that 
people are trying to either borrow the prestige 
of the Nobel, or buy it,” says Frank Wilczek, 
a physicist at the Massachusetts Institute of 
Technology in Cambridge, who won a share 
of the Nobel Prize in Physics in 2004. “The 
suspicion is that these provide more benefit 
to the egos of the founders than to science,”
adds Jack Stilgoe, a lecturer in science policy 
at University College London. 

And although they support the goals of the 
prizes, critics say that the strategy for achiev- 
ing them is at best misguided, and at worst, 
could backfire. By bestowing riches on a few 
individuals, they say, the prizes could funnel 
money and attention towards people and fields 
that are already prestigious and well funded or, 
in some cases, could reward weak scientists or 
untested ideas. “Prizes are a good thing, but the 
question is, if your goal is to help science, are 
large prizes the most efficient way to 
do that?” asks Wilczek. 


TOWARDS THE NEXT GENERATION 

First awarded in 1901, the Nobel 
prizes have become established as the 
benchmark of excellence in the sci- 
ences. Since then, other awards have 
sprung up and gained prestige within 
specific disciplines. Some, such as the 
Fields Medal and the Abel Prize for 
mathematics, were designed to reward 
achievements in disciplines that are 
not covered by the Nobels. Others, 
such as the Lasker Awards for medi- 
cal sciences, have gained a reputation 
for predicting future Nobel winners. 

The Fundamental Physics Prize was 
the first of the latest breed of awards 
(see ‘Follow the money’). It burst on 
the scene in July 2012, when Milner 
announced that he had given nine 
awards of $3 million each to promi- 
nent theoretical physicists, and that 
he planned to sponsor one additional award 
each year. (Polyakov was the first single-award 
winner.) Milner, who himself pursued gradu- 
ate studies in theoretical physics, says that he 
wants to show that foundational research can 
be as financially rewarding as careers in sports, 
entertainment or business; indeed, he chose the 
size of the prize to mirror the type of annual 
earnings seen in the financial world. “The best 
minds should make at least as much as any 
trader on Wall Street,” Milner says.

In January, Yin launched the Tang Prize, 
four awards of 40 million Taiwanese dollars 
(US$1.3 million) for the winners, plus a grant 
of 10 million Taiwanese dollars each for their 
research. The Tang Prize will be awarded every 
two years from 2014 onwards and will recognize 
advances in sustainable development, biophar- 
maceutical science, Chinese studies and law. 
“For the past 100 years, it is mainly Western 
countries and Western research institutions that 
have fostered talented researchers,” Yin says.
“Now, as Asian economies are developing well, 
we should also shoulder […]

[…] announced in February, also originates with
Milner — but this time, he brought in friends 
and colleagues including Zuckerberg, Wojcicki 
and Brin. They split $33 million equally among
11 laureates, and have committed to
giving five new prizes of $3 million each year. 
“We all have a background in science, though 
we weren’t all the best students,” says Wojcicki.
“This is a way for us to reconnect with that and 
to give something back.” March saw the inaugu- 
ral award of the United Kingdom's £1-million 
(US$1.5-million) Queen Elizabeth Prize for 
Engineering, which is supported by charitable
donations from corporate sponsors and was set
up by the Royal Academy of Engineering explicitly
to give engineers a taste of the glamour and
recognition that comes with the Nobels.

[Photo caption: Alexander Polyakov receives the 2013 Fundamental Physics
Prize from actor Morgan Freeman at a ceremony modelled on the Oscars.]

After the initial surprise at the big sums 
involved, the first question in most people's 
minds was how the winners would spend their 
cash. “I really admire these billionaires for want- 
ing to give back to science — but I do hope some 
of these large amounts goes into research,” says
geneticist and entrepreneur Craig Venter, of the 
J. Craig Venter Institute in San Diego, Califor- 
nia. “It’s not so great if the winners just go and 
buy a bigger house with their prize money.”

Many of the winners seem a little sheepish 
about their windfall; those who are willing to 
be interviewed tend to mumble that they have 
not yet decided what proportion to keep and 
what to give to research. One person who has 
decided is Tejinder Virdee, a particle physicist 
at Imperial College London, who in December 
2012 shared a $3 million ‘special’ Fundamen- 
tal Physics Prize with six other leaders of the 
hunt for the Higgs boson at the Large Hadron 
Collider (LHC) at CERN, Europe's particle- 
physics lab near Geneva. (The special prizes 
were in addition to the nine prizes Milner had 
previously announced.) Virdee plans to use 
his winnings to pay for science equipment in 
schools in sub-Saharan Africa and to support
an exchange programme to bring teachers 
from these schools to visit the LHC. “I wanted 
to find the way to get the most leverage out of 
the money,” he says. “It costs relatively little to
train a teacher, but in turn they could reach 500 
students in the next few years.” 

Even if winners invest in their work, some 
researchers worry that the prize money will 
largely reward those who already have ample 
funding and recognition. Although the physics 
prizewinners do not include any Nobel laure- 
ates, between them they have won pretty much 
every other major award, including the 
Wolf Prize in Physics, the Fields Medal 
for mathematics and the MacArthur 
‘genius grant’. “These are not underfunded
or unappreciated people,”
says Peter Woit, a mathematician at 
Columbia University in New York. 
Many of the Breakthrough Prize win- 
ners are already regarded as shoo- 
ins for future Nobels; one, Shinya 
Yamanaka, a stem-cell biologist at 
Kyoto University in Japan, already 
won a share of a Nobel last year.

This means that the prizes could 
end up increasing the divide between 
the scientific haves and have-nots. 
“There's a huge disparity between the 
money that the big-wigs are getting 
and that going to other scientists,” 
says Bob O'Hara, a biostatistician at 
the Biodiversity and Climate Research 
Centre in Frankfurt, Germany. 
“US$33 million could fund my whole 
institute for three to four years.” 

O’Hara and other researchers also complain 
that the Breakthrough Prize recognizes the same 
popular fields as the Nobels. “One frustration 
for biologists is that the Nobel prize is focused 
on physiology and medicine and so it neglects 
other areas of the life sciences, which are just 
as important,” O’Hara says. The Breakthrough
Prize does little to redress the balance, overlook- 
ing areas such as ecology and evolutionary biol- 
ogy in favour of research into molecular biology 
and disease. What is more, O'Hara says, “there 
was an emphasis on diseases of the rich, in the 
West, at the expense of diseases that are
prevalent in the developing world”.

Wojcicki notes that there is a catch-22: 
awards that recognize the most extraordinary 
scientists will reflect existing trends in fund- 
ing, because popular areas will provide a big- 
ger pool of candidates from which to choose. 
But, she says, “I do agree that we should use the 
awards to drive change.” 


EXPENSIVE GAMBLE 

Although the Breakthrough Prize has been 
censured for playing it safe, critics are arguing 
that the Fundamental Physics Prize is making 
choices that are too risky. Woit notes that five 
out of the nine initial physics recipients — four 
of whom are based at the Institute for Advanced
Study in Princeton, New Jersey — study string
theory, the idea that elementary particles are 
composed of vibrating loops of energy. A vocal 
critic of string theory, Woit has long argued 
that this research area has attracted a dispro- 
portionate amount of funding, despite lying 
beyond the range of direct experimental test- 
ability. “The obvious danger is that you may 
be giving a large award to someone for a com- 
pletely wrong idea,” he says. 

Polyakov, whose own award was partly for 
contributing mathematical techniques to 
string theory, sees this willingness to gam- 
ble on ideas as the new prize’s niche. “For 
me, ideas have their own reality,” he says.
But because in future the Fundamental 
Physics and Breakthrough prizes will be 
awarded by committees made up of the 
previous laureates, critics fear that current 
biases will be reinforced. Woit points to this 
year’s selection of Polyakov. “When string 
theorists at the Institute for Advanced Study
give their first award to a colleague who works 
on the same stuff as them, then it is a serious 
problem,” he says. 

Milner counters that next year the winners 
of the ‘special’ prizes — the seven Higgs hunt- 
ers and physicist Stephen Hawking — will shift 
the balance on the judging panel. “It is a self- 
correcting loop,” he says. 


PUTTING ON THE RITZ 

The physics prize’s black-tie ceremony in 
Geneva, consciously modelled on the Oscars, 
highlighted the ambition of its founders to 
inspire current and future scientists. “I don’t 
see why millions shouldn't ultimately watch 
this ceremony,” says Milner. Scientists in the
audience were both entertained and bemused; 
one described it as “lots of fun”, another as
“excruciatingly long”. 

The question is, will the money and razzle- 
dazzle have any real impact? Stilgoe challenges 
Milner’s claim that the awards will encourage 
early career scientists to stay the course, rather 
than — like Milner, Zuckerberg and Brin — 
leaving for more lucrative pastures. “The idea 
that anyone would make a career choice based 
on the minuscule chance of winning, say, a 
Nobel, is ridiculous,” he says. “Scientists, on 
the whole, are not in it for the money — and I 
am not sure we should want them to be.” 

Fred Cooper, a physicist and a visiting 
scholar at Harvard University in Cambridge, 
Massachusetts, also questions whether the 
awards will really speak to the public. “Visit 
YouTube and you'll see that the public already 
turns to science celebrities — Michio Kaku, 
Brian Greene and Sean Carroll — to learn 
about physics, not to the winners of the
Fundamental Physics Prize,” he says. “If outreach
is your aim, then give money to those that are 
already great communicators.” 

Even if the awards do inspire young people, 
Stilgoe argues that they send out the wrong 
message. “The prizes reinforce the mythology
of science in which lone geniuses come up with
brilliant ideas on their own,” he says.

FOLLOW THE MONEY

A crop of new science prizes offers winnings greater than the Nobels.

Name | Year introduced | Prize value
Breakthrough Prize in Life Sciences | 2013 | US$3,000,000
Fundamental Physics Prize | 2012 | US$3,000,000
Tang Prize | 2013 | US$1,675,000
Queen Elizabeth Prize for Engineering | 2013 | US$1,500,000
Nobel prize | 1901 | US$1,200,000
Blavatnik Award for Young Scientists | 2013 | US$250,000
Lasker Award | 1946 | US$250,000
Fields Medal | 1936 | US$14,700

And some say that the prize money would
be better used to drive research directly. In
2011, for example, Venter joined forces with
the X Prize Foundation and health-care firm
Medco Health Solutions, based in Northampton,
UK, to offer a US$10-million prize to the
first team to sequence accurately the genomes
of 100 centenarians. “I’m always more excited
by awards that push or drive innovation, rather
than ones that just recognize past achievements,”
Venter says.

Many researchers favour the idea of targeting
awards at promising scientists early in their
careers. “A small award at this stage is a fantastic
idea,” says Cooper. At this point, scientists
are in a vulnerable position, struggling to win
grants and often supporting young families. “It
will just free scientists up to do more research;
it’s about getting the biggest bang for your
buck,” Cooper says.

This month, billionaire investor Len Blavatnik
launched an award with prizes of $250,000 for
young scientists. The scheme, which builds on
a previous, regional version that has been running
since 2007, will be administered by the
New York Academy of Sciences. (Nature Publishing
Group, Wilczek and Venter are among
the advisers for the award.) Blavatnik says
that he saw a gap in the prizes market when he
attended a Nobel ceremony several years ago.
“What struck me was that most of the laureates
were quite old and they received the award
for something they did 30 or 40 years prior to
that,” he recalls. “I thought, in terms of impact
on the world, it would be good to award young
people and create something that would allow
them to thrive.” One of the first winners of the
regional Blavatnik Awards, Ruslan Medzhitov,
an immunobiologist at the Yale School of
Medicine in New Haven, Connecticut, says that
the honour enabled him to attract funding and
more prizes, including a share of the $1-million
Shaw Prize in Life Science and Medicine.

As for the new breed of mega-prizes, even
some of the critics acknowledge, with a
laugh, that they would accept one if it were
offered to them. And Hans Clevers, a molecular
geneticist at the Hubrecht Institute in Utrecht
in the Netherlands and one of the inaugural
Breakthrough Prize winners, says that the new
Nobels could rival the prestige of the old ones
in 30 years or so, if they can consistently
identify high-calibre winners.

The organizers of the Nobel prizes, however, 
remain unruffled by the upstarts. “For us, the 
important issue is to continue the good his- 
tory and track record that we have,” says Lars 
Heikensten, executive director of the Nobel 
prizes, based in Stockholm. “If we fail, it will be 
because we fail to maintain that level of respect, 
not because other prizes are acting as rivals to 
us. We've been in this business for 110 years 
and we plan to be in it forever.”

SEE EDITORIAL P.138


Zeeya Merali is a science writer based in 
London. 
FLY, AND BRING ME DATA


Unmanned aerial vehicles are poised to take 
off as popular tools for scientific research. 


BY EMMA MARRIS 


The Tempest (wingspan 3.2 metres, cruising speed 75 knots)
was designed to fly into severe storms. But during a test run
in March for a new project, it is soaring through the bluest of 
skies. On the ground below, PhD student Maciej Stachura of the 
University of Colorado (UC), Boulder, is tapping on a tablet computer, 
transferring control to the aircraft’s own computer after a manual take- 
off. Systems engineer James Mack keeps his hands loose around a con- 
troller in case a problem arises, while Neeti Wagle, another PhD student, 
scans the skies to make sure the Tempest does not collide with anything. 

The plane's job today is to locate a beacon sending out a simulated 
distress signal. As it circles overhead, the Tempest’s gas-powered engine
makes the distinctive lawnmower-like noise that calls to mind the infor- 
mal name often given to such aircraft: drones. Unmanned aerial vehicle 
(UAV) is more commonly used in scientific circles. 

The UC Boulder team watches and listens as the 40 minutes or so of 
flight time tick by and the Tempest becomes a distant speck in the bright 
sky. Then a note of concern enters Stachura’s voice. “It is not doing a 
great job. It should be getting closer to us at this point,” he says. Finally, 
the drone turns and heads back towards the beacon. “Oh, there it goes,” 
says Stachura, clearly relieved. 

The use of drones in science has taken a similarly roundabout route. 
NASA first experimented with custom-built UAVs in high-altitude 
research during the 1970s, but unmanned planes have been slow to 
catch on. Drones with top-notch sensors were too expensive to tempt 
researchers and cheap versions could not offer much of value. During 
the past decade, however, lower prices and technical advances — from 
on-board navigation using the Global Positioning System (GPS) to 
miniaturization of autopilots — have lured many scientific groups to 
experiment with UAVs. 

Already, they offer an efficient way to gather data and are making 
important advances in polar research, volcano studies and wildlife 
biology. “They are on their way to becoming this indispensable and 
revolutionary technology,” says Adam Watts, an ecologist at the
University of Florida in Gainesville who has flown drones for years.

But technical and legal hurdles stand in the way of their wider use. 
Researchers are trying to improve the autonomy, manoeuvrability and 
endurance of UAVs. And regulations, particularly in the United States, 
place strict limits on where and how researchers can use the devices. 
If these rules loosen up — and there are signs that they may — flying 
science robots may start taking to the skies in much greater numbers. 


LOFTY HEIGHTS 

The drones used by military forces to hunt down enemies have attracted 
growing scrutiny in recent years, but some of them have also been used 
for science. NASA has conducted hurricane and climate studies with 
Northrop Grumman’s Global Hawk, which can reach an altitude of 
nearly 20 kilometres — much higher than commercial planes fly. The 
agency got the drone for free from the Air Force, but interested scientists 
must be prepared to pay US$20 million for such a craft — no sensors 
included. 

Most researchers have to make do with much smaller and cheaper 
systems. A radio-controlled fixed-wing UAV such as the Tempest can be 
bought off the shelf for a few thousand dollars. And quadrotor helicop- 
ters can be purchased for just $300. Slap on a few sensors, an autopilot 
and a cheap computer preloaded with algorithms, and researchers have 
an unmanned aerial system (UAS). 

Despite the differences in equipment, military and civilian drone-research
programmes have been closely linked, with advances flowing between the
two sides. Many university UAV programmes are, in fact, part-funded by
the military.

NATURE.COM To see robot quadrotors play the James Bond theme: go.nature.com/mmhn94

For now, most researchers working with drones are focused on improving
the technology to make the devices more agile, more autonomous and
better able to work in groups. Autonomy requires a suite of algorithms
to interpret data from sensors, make decisions about
where to fly, control the plane’s path and classify objects captured by the 
UAV’s cameras. And all of that computing has to happen in real time 
on tiny, light computers bouncing around in three-dimensional space. 

One area of focus is vision-based navigation. Systems that rely on 
the GPS can achieve little better than 3-metre resolution at best — fine 
for open outdoor landscapes, but not good enough for urban areas or 
indoor settings. Drone developers would like to send their machines 
into earthquake-damaged buildings to look for survivors, which would 
mean avoiding errant beams, power lines and closed windows. To do 
this, an aircraft requires a complex system of cameras, gyroscopes and 
accelerometers to figure out where it is — and where the obstacles are. 

A team led by Ashutosh Natraj, now at the University of Oxford, 
UK, has taught drones with fish-eye cameras how to ‘find’ themselves. 
The robots’ algorithms divide up the circular visual field into sky and 
ground, identify a horizon line between the two and then derive the 
drone’s altitude and orientation. For city flying, the team is writing algo- 
rithms that recognize and use the verticals and horizontals of buildings 
and streets as guides to navigate up and down, forwards and backwards. 
At night, the drone could project a laser pattern onto its surroundings 
to orient itself. Camera-based navigation is smart, says Natraj, because 
a single camera collects more quality information than a number of 
expensive, heavy sensors such as laser range-finders, obviating the need 
to integrate many sensors. As part of a three-year project to design UAVs
that can deliver medical care after natural disasters, Natraj is developing 
systems to do all the image processing on-board the helicopter, rather 
than through a wireless connection to a separate computer. 
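
The horizon step described above can be illustrated with a minimal sketch. It is not the Oxford team's code: it swaps the fish-eye camera for an ideal pinhole camera, assumes the sky and ground have already been separated so that two points on the horizon line are known, and uses a hypothetical function name, `attitude_from_horizon`, and a hypothetical focal length in pixels. Under those assumptions, the tilt of the horizon line gives the roll and its distance from the image centre gives the pitch.

```python
import math


def attitude_from_horizon(p_left, p_right, image_size, focal_px):
    """Return (roll, pitch) in radians from two pixel points on the horizon.

    Simplifying assumptions (not from the article): an ideal pinhole camera
    pointing along the aircraft's nose, principal point at the image centre,
    pixel x to the right and y downward, p_left to the left of p_right, and
    a horizon far enough away that its dip can be ignored.
    """
    (x1, y1), (x2, y2) = p_left, p_right
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0

    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("the two horizon points must be distinct")

    # Roll: tilt of the horizon line relative to the image's x axis.
    # Positive when the horizon slopes downward towards the right.
    roll = math.atan2(dy, dx)

    # Signed perpendicular distance (pixels) from the image centre to the
    # horizon line, positive when the horizon lies below the centre.
    offset = ((x1 - cx) * (-dy) + (y1 - cy) * dx) / length

    # Pitch: a horizon below the centre means the camera is tilted upward.
    pitch = math.atan2(offset, focal_px)
    return roll, pitch


# Hypothetical 640 x 480 image with a level horizon 50 pixels below centre.
roll, pitch = attitude_from_horizon((0, 290), (640, 290), (640, 480), 500.0)
print(math.degrees(roll), math.degrees(pitch))
```

A real fish-eye system would first need to map pixels through the lens model, and altitude would need extra information that this sketch does not attempt to recover.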

The Oxford disaster-relief UAV project is hoping to develop mul- 
tiple UAVs that talk to one another. Such research on swarms is a hot 
area, says Hyunchul Shim, director of the Center of Field Robotics for 
Innovation, Exploration and Defense at the Korea Advanced Institute 
of Science and Technology in Daejeon, South Korea. “If you go fast, go
alone. If you go long, go together,” he says. Data collection and
search-and-rescue missions are faster and more efficient with a team of
drones to pool data and provide redundancy in case some machines fail.
But the use of more vehicles also adds
complexity. Drones working together have to be able to communicate 
with one another and make collective decisions. 

Researchers are also focusing on increasing the endurance of UAVs, 
most of which are fuelled by gas engines and batteries. To keep the 
weight and costs down, researchers often use tiny drones with limited 
fuel capacity, which means short flights. Some groups are working on 
miniaturizing batteries, others on making the planes smart enough to 
take advantage of thermal updrafts and wind features, as birds and glid- 
ers do. Roland Siegwart, head of the Autonomous Systems Lab at the 
Swiss Federal Institute of Technology Zurich, has a team developing 
solar-powered planes that would never have to land. “I call them ‘low-flying
satellites’,” he jokes. They could actually work better than satellites
for collecting data, because researchers could direct them. “You can 
have an up-to-date image of bush fires, move them over illegal logging 
operations or look for people lost in the ocean,” Siegwart says. 


An autonomous research drone in Australia sprays herbicide onto weeds.


VIDEO STARS 

Teams working on UAVs tend to keep abreast of each other’s work 
through videos posted online. The field’s biggest YouTube ‘star’ is Vijay 
Kumar at the University of Pennsylvania in Philadelphia. Kumar's group 
controls quadrotor helicopters indoors with a modified Vicon system — 
the motion-capture system used in Hollywood and by the video-game 
industry. His videos show drones flying in tight formation, transporting 
two-by-fours, and even — in one video with more than 3 million views — 
performing the James Bond theme on multiple instruments. “The Inter- 
net has changed the rules,” says Shim. And, Siegwart says, “It also spreads 
the information a little further, which helps attract good students.”

With new talent helping to make drones smarter and cheaper, the
regulations that control unmanned flight will be the biggest barrier
to their expanded use in research. “This is still the major issue,” says
Siegwart.

That is particularly true in the United States, where Federal Avia- 
tion Administration (FAA) rules make it laborious to get permission 
to fly drones outside (except for non-commercial hobbyists, for whom 
the rules are looser). “We need permission to go out into a field on
campus and fly something that is six inches off the ground,” grouses
Eric Frew, head of the Research and Engineering Center for Unmanned
Vehicles at UC Boulder. “It is a one-size-fits-all approach.”

The FAA, based in Washington DC, requires that would-be drone operators
apply for and receive one of two certificates for their research
programme if they want to fly their UAVs outdoors. The applications
request a lot of information, “so the FAA can determine if the operation
can be conducted without hazard to other aircraft or people and property
on the ground”, according to the agency’s communications office. This
means that certifications are not granted for flights in cities and other crowded areas. These
certifications are also limited to a 20-mile-square area (around 32 kilometres
on a side), so when the UC Boulder team took Tempest storm-chasing
across swathes of the country, they needed 59 separate permissions. 

Once a certification is obtained, usually within 60 days, a group can 
fly its aircraft during daylight hours in the designated spot for a year or 
two as long as they file a NOTAM (a Notice to Airmen) in advance
with the FAA every time they want to fly. 

Each flight also requires a certified pilot. During the March test flight, 
that was Stachura, who spent most of the test staring into the plane's 
controls on his tablet. The FAA also requires that an observer be on 
hand to watch for potential collisions and that someone be monitoring 
the radio from the local airport. 


FLIGHT DELAYS 

Eric Johnson, who studies UAVs at the University of Georgia in Athens, 
has looked at regulations around the world and says that “among NATO 
countries, the United States is about the worst”. But as long as there are 
no accidents, the consensus seems to be that the regulations will loosen. 
The FAA Modernization and Reform Act, which passed last year, calls 
for the US Department of Transportation to produce a plan by late 2015 
for “the safe integration of civil unmanned aircraft systems into the 
national airspace system”. 

By contrast, says Johnson, Australia and Canada allow the most types 
of operation, perhaps because both countries have a lot of airspace and 
smallish bureaucracies. Salah Sukkarieh, who studies robotics and 
intelligent systems at the University of Sydney in Australia, says that the
country’s liberal regulations are allowing the UAS field to grow there, 
despite its funding being a fraction of that available to US scientists. 

Although most drone research has focused on improving the UAVs 
themselves, some scientists have been putting the devices to use. In 
March, NASA used a small electric military drone, the Dragon Eye, to 
sample and photograph the noxious gas plume spewing from Turrialba 
Volcano near San Jose in Costa Rica. The team compared the Dragon 
Eye's measurements of sulphur dioxide to those made by the Terra satel- 
lite in an effort to calibrate the space-based readings. It would have been 
too risky to send a human pilot near the volcano, where there are strong
updrafts and ash that could choke a plane’s engines.

The Tempest in Colorado was designed to collect data from severe storms.

James Maslanik, a remote-sensing expert at UC Boulder, has been 
involved in a number of studies using drones to measure various quali- 
ties of sea ice in the polar regions since 2000. Here, too, UAVs can ven- 
ture into regions too dangerous for a manned aircraft. In the Arctic, 
Maslanik says, “we are flying these things at 100 feet off the ice, wind is 
80 knots, temperature of minus 40°C.”

At the opposite end of Earth, researchers from UC Boulder have used
UAVs to measure jets of wind that scream down from the Antarctic
plateau into Terra Nova Bay. Such measurements could help scientists
to understand the dynamics of sea-ice formation around Antarctica,
which creates dense salty water that sinks and helps to drive global
ocean currents. “Nobody had an aircraft out there during winter when
the winds are strongest and took measurements because the conditions
are too extreme,” says Maslanik. The data collected so far, he says,
show unexpectedly complex wind patterns, including fierce, localized
jets that push sea ice off shore and speed up its formation.

Biologists are also starting to use UAVs in their field work.
In India, the conservation group WWF is using drones to look for 
rhino poachers. Tom McKinnon, a retired engineer and managing 
director of InventWorks, a product-development firm in Boulder, 
is outfitting autonomous helicopters with nets to capture rare Mon- 
golian vultures so that scientists can attach transmitters and study 
their movements. 

On the plant side, Sukkarieh has developed a system using a fixed- 
wing UAV and a helicopter in tandem to locate weeds in remote range- 
lands and spray them with herbicide. And several groups are teaching 
drones how to tell one kind of plant from another, so that they can make 
maps of vegetation. Rather than purchase advanced sensors, which add 
weight and increase costs, Sukkarieh’s team is writing code to allow the 
UAV to map and classify vegetation using just GPS, a camera and an 
inertial measuring unit, which collects data on the position of the air- 
craft in space. The challenge of making trade-offs between sensors and 
weight has prompted Sukkarieh to think about designing UAV systems 
from scratch around their specific tasks, rather than just bolting sensors 
to an off-the-shelf aircraft. “What if the wings were sensors themselves?” 
he wonders. 

For researchers without engineering expertise, however, the available 
UAVs offer plenty of opportunities. The Scottish Environmental Protec- 
tion Agency, for example, purchased a drone in 2012 from Swiss com- 
pany senseFly to survey estuaries for algal blooms — something that is 
difficult to do on foot. Susan Stevens, a scientist at the agency, says “you 
can get involved and use the technology without being an expert in it”. 

Still, the best landings come with experience. As the UC Boulder team 
finishes up testing the Tempest, Mack, who was a UAV hobbyist before 
he joined this research team, gently sets the drone down on its belly 
in a patch of short grass. He picks it up in one hand to carry it back to 
the van. 

Everyone is relaxed, having spent most of the 40-minute test flight 
doing little more than watching the Tempest and enjoying the spring 
day. If this is the future of field research, it looks pretty easy. “If every- 
thing goes well, it is fairly boring,” Stachura acknowledges. “Because it 
is autonomous, right?”


Emma Marris is a freelance writer in Klamath Falls, Oregon.

Stem-cell researchers
must stay engaged 


Recent developments have rekindled the ethical debate over human cloning. 
This is no time for complacency, caution Martin Pera and Alan Trounson. 


Last month, news ricocheted around
the world that reproductive biologist
Shoukhrat Mitalipov and his colleagues
at the Oregon Health and Science
University in Portland had used cloned 
human embryos to produce stem-cell lines 
specific to individual patients. Although
critics have since raised some problems with 
the paper, a preliminary enquiry indicates 
that the conclusions of the work still stand 
(see Nature http://doi.org/mnk; 2013). 

This formidable technical feat is poten- 
tially a key step towards developing replace- 
ment tissues to treat disease. Media coverage 
of the paper has also rekindled long-stand- 
ing controversies about human cloning, 
the use of human eggs and the destruction 
of human embryos. The achievement is a 


timely reminder that scientists must remain 
actively engaged in discussions about the 
ethics of using human embryos for research 
in cell biology and regenerative medicine. 
More than 1,000 embryonic stem (ES) 
cell lines have now been established world- 
wide. There has also been an exponential 
increase in the use of induced pluripotent 
stem (iPS) cells — a type of stem cell that can 
be made from reprogramming the cells of 
body tissues, such as the skin or blood, back 
to the embryonic state. As a result, many 
people question whether there is still a need 
to obtain ES cells from the ‘spare’ human 
embryos that are surplus to those needed 
for in vitro fertilization. In recent months, 
it seemed as if stem-cell research had finally 
moved on from the uncertainty over funding
and career prospects that has dogged it since
1998. Indeed, a prominent lobbying group 
for human ES-cell research, the Coalition 
for the Advancement of Medical Research in 
Washington DC, closed its doors last month 
(see go.nature.com/teedqq). 

We believe, however, that enduring public 
concerns will inevitably resurface as stem- 
cell biology progresses. Also, new ethical 
challenges will need exploring — such as 
whether it is justifiable to produce human 
reproductive cells (or gametes) from iPS 
cells, and what they should be used for. 

To ensure that rational discussion among 
scientists, policy-makers, regulators and the 
public precedes the formulation of regula- 
tory policy, individual researchers should 
try to avoid confronting the public with
controversial scientific leaps out of the
blue. Instead, scientists should gather to dis-
cuss the present and future course of human
embryo research. They should also help to
establish a formal programme for public
consultation, similar to that led by two of the
UK research-funding councils for synthetic
biology. Ultimately, researchers should be
prepared to continue arguing the case with
governments and regulatory bodies when
research on human embryos and ES cells is
scientifically and medically merited.

13 JUNE 2013 | VOL 498 | NATURE | 159


FUTURE PROOF 

Neither the current availability of numerous 
established ES-cell lines nor the increasing 
use of iPS cells rules out the possibility that 
human embryos will be needed in regenera- 
tive medicine in the future. 

Clearly, the impetus for deriving new 
stem-cell lines from embryos has declined. 
More than 200 ES-cell lines are now on 
the US National Institutes of Health regis- 
try, meaning that anyone with the agency's 
funding can use them. Meanwhile, more 
than 1,200 lines are on the widely used 
International Stem Cell Registry, along with 
information about where to obtain them. 

To produce new human ES-cell lines, 
researchers must negotiate a complex set of 
regulatory and compliance hurdles, obtain 
tens to hundreds of high-quality spare 
embryos, and do lab work that is labour- 
intensive, time-consuming and expensive. 
With so many well-characterized lines avail- 
able, there is little incentive to derive new 
lines using established technology. In fact, 
an analysis in 2009 revealed that around 70% 
of the published research on human ES cells 
is based on two cell lines.

Yet the technology being used to derive 
and propagate cell lines does have shortcom- 
ings. For example, some cell lines are hard 
to renew and expand; about one-quarter of 
them develop genetic abnormalities in vitro 
after a routine period of cultivation; and, in
many cases, it is still very difficult to convert 
stem cells into fully mature functional cells, 
such as heart or liver cells. 

Within the next decade, it is possible that 
technological improvements in the deriva- 
tion and maintenance of ES cells will enable 
researchers to overcome deficiencies in cul- 
ture systems. If this occurs, the use of new 
embryos to derive superior stem-cell lines 
might well be justified. 

Similarly, it is unclear whether iPS cells will 
render ES cells redundant, despite the fact 
that the use of iPS cells has shot up in recent 
years (see ‘Shifting preferences’). For example, 
there are still question marks over the genetic 
integrity of iPS cells and whether they dif- 
ferentiate into cells that are useful for thera- 
peutic purposes as robustly as ES cells do. For 
instance, the skin cells of an adult, from which 
iPS cells can be derived, may have already
accumulated troublesome mutations, and the
effects of reprogramming on the genetics of
somatic cells are still being debated.

SHIFTING PREFERENCES
In recent years, research on human induced pluripotent stem (iPS) cells has grown rapidly, whereas studies
of human embryonic stem (ES) cells seem to have plateaued. (Data include reviews and research articles.)

Research groups in Japan and California 
are rapidly pushing studies on iPS-cell 
derivatives towards the clinic to treat age- 
related macular degeneration (a major cause 
of blindness) and genetic skin disorders, 
among other diseases. But the therapeutic 
potential of iPS cells relative to that of ES 
cells will be known only when the safety 
and efficacy of both cell types has been 
thoroughly evaluated in preclinical animal 
models and in early-stage clinical trials. 

Even if iPS cells do make ES cells obsolete, 
new directions in research using iPS cells 
could fuel more lively debate among the sci- 
entific community, regulators and the public 
than has been spurred so far by work on ES 
cells. Healthy mice, for example, have been 
produced from fertilized eggs derived from 
iPS cells. If it becomes possible to make
human gametes from iPS cells, these could 
have many uses: to study the basis of human 
infertility; to identify factors present in the 
egg that might enhance its reprogramming 
for stem-cell lines; to produce embryos in vivo 
and in vitro for treating human sterility; or 
even to genetically modify the germline to 
prevent disease. 

These possibilities may be even more 
ethically challenging than the use of spare 
embryos from in vitro fertilization to make ES 
cells. So far, such prospects have hardly been 
mentioned in the public arena. 


CLONING REBORN 

One procedure that has generated much ethical
controversy is somatic cell nuclear transfer
(SCNT), or ‘therapeutic cloning’. Used by
Mitalipov and his colleagues, this involves 
transferring the nuclear genome of an adult 
body cell, such as a skin or liver cell, into an 
unfertilized egg from which the nucleus has 
been removed. After ‘tricking’ the egg into
becoming an embryo (by mimicking the
chemical changes triggered by fertilization),
researchers can produce ES-cell lines that
are genetically matched to the original donor. 

Biologists initially envisioned SCNT as 
a way to produce patient-specific tissues 
needed for transplantations. Currently, it is 
permitted in only a handful of jurisdictions, 
including Britain, Australia, China, Califor- 
nia, New York and Oregon. But until Mital- 
ipov’s breakthrough last month, no one had 
managed to convincingly produce ES cells 
from human cells using SCNT.

The procedure is technically challeng- 
ing, and a major stumbling block has been 
the need for numerous mature human eggs. 
In fact, interest in SCNT in humans waned 
substantially after the discovery of iPS cells, 
as measured by a decline in the number of 
SCNT papers and researchers working on it. 

The findings of Mitalipov and his col- 
leagues — or the future discovery of a way 
to derive hundreds of mature eggs from ES 
cells or iPS cells — could revive work on 
SCNT. Several biologists have argued that 
transferring nuclei from the somatic cells 
of humans to the eggs of another species, 
such as those from the frog Xenopus laevis, 
might be a powerful tool for understanding 
the reprogramming process in human cells.

Even now, there is a compelling case for 
using SCNT-related technology in at least one 
clinical setting. Some mutations in mitochon- 
drial DNA are associated with several poten- 
tially fatal disorders of the cardiovascular 
and nervous systems. Two studies from the 
past year show that the transfer of the hap- 
loid genome from an affected person into a 
healthy donor egg can prevent the inheritance 
of such mutations in cultured human ES-cell
lines. And Mitalipov and his co-workers
point out that SCNT-derived cell products 
could be used to treat patients with mitochon- 
drial diseases. Such proof-of-concept studies 
may represent the best hope yet for patients, 
or for women who are at risk of passing on a 
mitochondrial disease to their children. 

We think that SCNT should be permitted
to facilitate experiments in vitro, but
not to develop a live human or human-animal
hybrid. The use of SCNT-related
technology to treat mitochondrial diseases 
does not involve cloning, but it does raise 
the question of whether it is acceptable for 
children to have three genetic ‘parents’: the 
mother who donates the egg nucleus, the 
father who donates the sperm nucleus and 
another woman who donates the mitochon- 
drial DNA. The UK Human Fertilisation 
and Embryology Authority recommended 
in March that the government authorize the 
use of this technique to help these patients. 
We agree with this judgement. 

Such possibilities need careful considera- 
tion and public consultation. We believe that 
the scientific community, which was forced 
to engage in ethical discussions in the early 
stages of stem-cell biology, should lead the 
way. As a first step, scientific academies such
as the US National Academy of Sciences or 
the Australian Academy of Science should 
organize symposia to foster debate on the 
ethical ramifications of recent advances and 
possible new breakthroughs. Scientists should 
also engage with the public and the broader 
medical community; for instance, by collabo- 
rating with patient advocate groups such as 
the UK Juvenile Diabetes Research Founda- 
tion, and health-care providers such as the 
UK National Health Service. This would 
enable scientists to keep abreast of people's 
concerns, and to inform stakeholders of the 
realistic benefits and limits of their research 
and the ethical challenges it may bring. 

The potential benefits of stem-cell 
research are immense. Prospects for trans- 
formative treatments for conditions such as 
macular degeneration, type 1 diabetes or 
Parkinson's disease are now on the horizon. 
But without first convincing governments, 
the public, and funding and regulatory bod- 
ies that all the possibilities have been thought 
through and evaluated, headline-catching 
results could create a backlash that unnec- 
essarily delays the tremendous potential 
benefits of cell therapies. SEE NEWS & VIEWS P.174

Observers in Amman, Jordan, watch the transit of Venus across the Sun in June 2012.


Time for an 
Arab astronomy 
renaissance 


Arab Muslim countries need a new generation of 
observatories to rejoin the forefront of the field, 
says Nidhal Guessoum. 


[…] astronomy enjoyed a golden
age from the ninth to the sixteenth
century AD. Great observatories in
Baghdad, Damascus, Maragheh, Samarqand
and Istanbul mapped the sky to set
dates for religious and civil festivals and for
astrology. Sophisticated calculations and
models led to advances in mathematics.
Today, Arab astronomy barely registers
on the world map. Scientific research is
weak across the Arab world, and astronomy
weaker still. Unlike countries of
comparable gross domestic product per
capita, such as Turkey, Israel and South
Africa, most Arab nations are generating
fewer than ten papers in the field each year,
and these are hardly cited. Few sizeable 
telescopes are operational or planned. 
The lagging state of astronomy is a par- 
adox for a region where funding should 
not be a serious constraint, at least in the
wealthier Gulf states. The region has sev- 
eral excellent observing locations above 
2,000 metres that benefit from clear skies. 
Public fascination is strong, as shown by 
the many local amateur associations and 
large gatherings for astronomical events, 
such as eclipses, comet passages or the 
most recent transit of Venus across the 
Sun in June 2012.

In my view, astronomy research is
being neglected because of the strongly utilitarian
Arab Muslim approach to science.
Cultural principles, such as serving the peo- 
ple first, led Arab nations to build bases in 
the applied sciences in the second half of the 
twentieth century, including petrochemical 
engineering and pharmaceuticals. There 
was also a need for the region to develop its 
infrastructure quickly after the departure
of colonial powers. Today, subjects such
as theoretical physics are taught widely but
are low cost and are considered low priority.
Astronomy seems
to require expensive buildings, equipment 
and technicians for little tangible return. 

Another problem is the lack of exper- 
tise in the management of large scientific 
projects — an essential element if obser- 
vatories and research centres are to operate 
effectively. The few large telescopes that have 
been built in the region in the past 50 years 
have been poorly run, are often inoperable 
and have produced few results. 

I call on Arab countries to build a new 
generation of observatories. A few medium- 
sized telescopes (one- to two-metres in mir- 
ror diameter) costing a few tens of millions 
of dollars would allow Arab astronomers 
to join front-line research by searching for 
supernovae, the afterglows of γ-ray bursts,
variable stars and extrasolar planets. 
Universities need to set up degree and 
international exchange programmes in 
astronomy to train and integrate the next 
generation of Arab astronomers. Such 
developments would galvanize academic 
and public interest in fundamental science 
across the region. 


A GOLDEN PAST

Astronomy had a central place in society 
from the early times of Islamic civilization. 
In the early ninth century, a few decades after 
the founding of Baghdad as the capital of the 
new Muslim empire, the caliph al-Mamun 
(AD 786-833) ordered the erection of 
two observatories: Shammasiyya near 
Baghdad, and Jabal Qasiyun on the high 
outskirts of Damascus. Their main aim 
was to check solar and lunar data in old 
Greek and Indian tables, and to produce 
civil and religious calendars. Facilities 
included a quadrant made of marble with 
a radius of five metres to measure angles on 
the sky, and a sundial with a central gnomon 
— the column that casts the shadow — more 
than five metres high. 

Islamic practice relies on astronomy for 
three purposes: computing prayer times 
for various locations and dates, which are 
based on the apparent motion of the Sun;
determining the direction to Mecca (the
Qibla) for prayers; and establishing the dates 
for holy festivals, particularly Ramadan (the 
month of fasting) and Hajj (the pilgrimage), 
which are set by the observation of the thin 
crescent of the new Moon. All three still 
cause heated arguments among Muslim 
astronomers and scholars. 

Historically, astronomy was also needed 
for navigation at sea and on land. Travel- 
lers and sailors learned that the arc of the 
Moon indicates the east-west line; the short- 
est shadow of a stick gives the north-south 
direction; the height of Polaris (the Pole Star) 
above the horizon gives the latitude of the 
place; and Mintaka, a star in Orion's belt, 
traces the celestial equator. 

Muslim rulers were also guided by astrol- 
ogy, believing that some days were more pro- 
pitious than others for mundane activities 
or stately decisions. Astronomers’ ability to
predict planetary motions and alignments,
eclipses and new and full Moons was a powerful
weapon in a ruler’s arsenal. Courts had
a resident astronomer, and mosques had a
time-keeper (muwaqqit).

By the thirteenth century, rulers were
erecting great observatories such as
Maragheh (in present-day Iran), which
was the largest in the world at the time.
Astronomers and students from around
the world used its sophisticated instruments,
which included an armillary sphere model of
celestial body motions several metres wide, as well
as its library of 400,000 books. Theories
developed there include the ‘Tusi couple’
that links linear and circular motion,
which was developed by the astronomer
Nasir al-Din al-Tusi in 1247, and later used
by Nicolaus Copernicus in his geometry of
planetary orbits.

An eleventh-century astrolabe, used to measure celestial positions, among other functions.

In the fifteenth and sixteenth centuries, 
even more stunning observatories were built. 
In the Samarqand observatory (completed 
in 1429; now known as Ulugh Beg Observa- 
tory), a 30-metre-high building housed ten 
instruments. These included an armillary 
sphere; an azimuthal quadrant for measur- 
ing the horizontal angle of the star from the 
north; and a meridian arc with a 40-metre 
radius, which measured celestial positions to 
within a few arcseconds. The Istanbul obser- 
vatory, built in 1577, although smaller, also 
housed ten instruments and had 15 full-time 
astronomers. Sophisticated tables giving the 
positions of stars, planets, the Sun and the 
Moon were produced in each. 

Thus hundreds of stars and constellations 
have Arabic names, such as Altair, Deneb, 
Vega and Rigel. Today, more than 20 lunar 
craters bear the names of Muslim astrono- 
mers, including Alfraganus (al-Farghani), 
Albategnius (al-Battani) and Azophi 
(al-Sufi). The scholar Abu Rayhan al-Biruni 
(AD 973-1048) used astronomy and trigonometry 
to determine Earth's circumference 
to within 0.3% of today’s accepted value. 
Muslim women participated too: in the tenth 
and eleventh centuries, Fatima of Madrid, 
daughter of the great Andalusian astrono- 
mer Maslama al-Majriti, helped her father to 
produce tables of star and planet positions. 
In the tenth century, Mariam of Syria was a 
skilled constructor of astrolabes for celestial 
surveying. 

From the thirteenth century onwards, 
major centres of learning were lost, such as 
those in the Iberian territory of Al-Andalus, 
and conservative rulers and clergy accorded 
religious knowledge an ever higher place 
than worldly science. Universities disappeared 
and old places of learning became antiquated 
and disconnected from scientific developments 
in Europe. Observatories were seldom given 
rich religious endowments (awqaf) and thus 
rarely continued for more than a few years or 
decades after their establishment. 

Thus the era of great Islamic obser- 
vatories came to an end in the later 
part of the sixteenth century, with the 
demise of the Ottoman empire and the 
rise of European science. The practice of 
astronomy, as with other areas of science 
at the time, depended on the good will of 
the caliph or patron. The Istanbul observatory 
was destroyed in 1580, less than three 
years after its construction, by a new ruler 
who had been convinced by the religious 
establishment that “prying into the secrets 
of the heavens” was reprehensible and would 
trigger God’s anger. 

The Ulugh Beg Observatory in Samarqand, Uzbekistan, completed in the fifteenth century, was used by several famous Islamic astronomers. 

As a result, no astronomy and little science 
were conducted in Muslim countries until 
the late-nineteenth century. 


ARAB ASTRONOMY TODAY 

Things got going again when European 
powers — Britain and France, in particular 
— colonized many parts of the Arab Muslim 
world, bringing modern ideas with them, 
but education to only a select few. 

For instance, the Lee AstroPhysical 
Observatory in Lebanon, named after its 
British merchant patron, Henry Lee, was 
built in 1873 by Cornelius Van Alen Van 
Dyck, a passionate professor of astronomy 
at what later became the American Uni- 
versity of Beirut. The observatory housed a 
25-centimetre telescope, which worked well 
enough until the facility closed in 1980. 

In 1891, French astronomers built an 
observatory on the hilltops overlooking 
Algiers; it contributed 1,260 photographic 
plates of the sky between 1891 and 1911 to 
the Astrographic Catalogue project, a large 
international effort to map star positions 
to a high degree of accuracy. In Egypt, the 
Helwan Observatory was built in the 
early twentieth century; an astronomy 
department was established at Cairo University, 
and the country joined the International 
Astronomical Union (IAU) in 1925 (ref. 5). 

Sadly, a world map of today’s observatories 
shows just two medium-sized telescopes in 
Arab countries: Egypt and Algeria. By com- 
parison, South Africa has half a dozen big 
observatories, including the South African 
Large Telescope (SALT) with an 11-metre 
primary mirror — the largest single optical 
telescope in the Southern Hemisphere. India 
has at least a dozen observatories, including 
the Indian Astronomical Observatory at 
Hanle, which houses a two-metre telescope. 

The largest telescope to have graced the 
Arab world is the 1.88-metre instrument at 
Egypt's Kottamia Observatory, in the desert 
75 kilometres outside Cairo. The telescope 
was inaugurated in May 1964, but for decades 
it was under-used or broken. It was refurbished 
in the 1990s, and Egyptian astronomers 
say that the telescope is now working, 
although few papers have resulted from it. 

In Iraq, an ambitious plan to build a 
world-class observatory in the northern 
high mountains was launched in the 1980s, 
envisaging 3.5- and 1.25-metre telescopes, 
along with a 30-metre radio telescope. Wars 
and their resulting damage meant that the 
project was never finished. Plans to relaunch 
it have been aired, without progress. 

In the past few years, two small observatories 
have been constructed in other 
parts of the Arab world. At an altitude of 
2,750 metres, the Oukaimeden Observa- 
tory near Marrakesh in Morocco hosts a 
50-cm robotic telescope for asteroid and 
comet searches. It is run by the Cadi Ayyad 
University in Marrakesh in collaboration 
with Uranoscope de l’Île de France (a French 
amateur astronomy association) and the 
Marrakesh Amateur Astronomy Associa- 
tion. Another observatory in Lebanon, built 
by Notre Dame University in Louaize, con- 
tains a 60-cm telescope, which is expected 
to begin operating soon. Other Arab coun- 
tries have smaller telescopes, with mirrors 
of 35-50 cm. 

Several Arab states have proposed one- to 
two-metre telescopes over the years, includ- 
ing Algeria, Libya, Oman, Saudi Arabia and 
the United Arab Emirates, but little progress 
has been seen. 


RESEARCH ANALYSIS 

To assess how badly astronomy research is 
suffering in the region, I compared publica- 
tion and citation data for Arab nations with 
data from Iran, Israel, South Africa and 
Turkey (see ‘Arab astronomy papers’). Arab 
astronomers published fewer papers and had 
fewer citations than astronomers in those 
other four countries. The entire Arab world 
published fewer astronomy papers than 
Turkey alone, and substantially fewer than 
South Africa or Israel. Citation figures are 
worse: Arab astronomy papers were cited 
less often than Turkey’s, South Africa’s or 
Israel's. 

As for degree programmes in astronomy 
or astrophysics at Arab universities, these 
can be counted on two hands. Small programmes 
exist in Egypt, Jordan, Lebanon, 
Saudi Arabia, Sudan and Algeria. Only a few 
dozen out of several million students major 
in astronomy or astrophysics at undergraduate 
or at master’s level, and home-grown 
PhD students are rare. 


PAUCITY OF PAPERS 

Arab countries produce fewer astronomy 
papers than nations with similar GDPs. 


PUBLICATION DATA 

Arab astronomy papers 

To assess the state of Arab astronomy 
research, I used the Thomson Reuters Web 
of Science to extract publication data for 
astronomy and astrophysics papers for 
each Arab country from 1 January 2000 
to 31 December 2009. For comparison, 
I collected similar data for authors from 
Turkey, Iran, Israel and South Africa. 

Because there were few papers for 
Arab countries, I examined them by hand 
and discarded ones on tangential and 
highly theoretical topics. The comparison 
countries had a greater number of papers, 
so I examined a random sample of 200 
papers from each country and scaled the 
totals accordingly. For Arab countries, 
40-50% of papers were excluded 
(reflecting the emphasis on theoretical 
work); for Iran, the percentage was 78%; 
for Israel, 25%; and for South Africa, 19%. 

The number of astronomy papers as 
a proportion of science papers for the 
Arab world is 3 per 1,000 (ranging from 
1 for Qatar to 6 for Bahrain; Yemen has 
an abnormally high ratio owing to its very 
low science production). This is similar to 
Iran (2) and Turkey (3), but much lower 
than Israel (14) and South Africa (24), the 
proportions of which are similar to those of 
the United States, China, India, Japan, Brazil 
and Spain. (In these countries, the range 
is 10-25 astronomy papers per 1,000 
publications.) 

The United Arab Emirates, for example, 
with a population of 8 million and a gross 
domestic product (GDP) per capita of 
US$46,000 in 2011, published 6,000 
science publications over 10 years, but only 
23 of those were in astronomy. Israel, with 
a similar population but a 30% lower GDP 
per capita than the United Arab Emirates, 
published a total of 143,000 scientific 
papers, of which 1,500 were astronomy 
articles. Similarly, of the 13,000 science 
papers that were published by authors 
from Lebanon, which has a population of 
4 million and a GDP per capita of $9,000, 
just 19 were in astronomy. 

The citation figures are even more 
striking. For publications in 2000-09, there 
were 1,507 citations for papers that had a 
first author from an Arab country and 1,596 
for papers that include an author from an 
Arab country (but not a first author). This 
is a total of 3,103 citations, compared to 
4,355 for Turkey, which contains one-fifth 
of the population of the Arab world. Israel’s 
and South Africa’s citation figures were 
20 times and 9.5 times higher, respectively, 
than those of the Arab world. 
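
The arithmetic behind the figures quoted in the ‘Arab astronomy papers’ box can be sketched in a few lines of Python. This is only an illustrative check, not the author's actual procedure: the function names are hypothetical, the inputs are the rounded totals given in the text (so they need not reproduce the box's per-country ratios exactly), and the 1,000-paper search total used for the sampling step is an invented example.

```python
# Rounded totals quoted in the text for 2000-09:
# country -> (science papers, astronomy papers).
PAPERS = {
    "United Arab Emirates": (6_000, 23),
    "Israel": (143_000, 1_500),
    "Lebanon": (13_000, 19),
}


def per_thousand(science_papers, astronomy_papers):
    """Astronomy papers per 1,000 science papers."""
    return 1000 * astronomy_papers / science_papers


def scale_from_sample(total_hits, sample_size, kept_in_sample):
    """Estimate the relevant papers in a search result from a
    hand-checked random sample (hypothetical helper)."""
    return total_hits * kept_in_sample / sample_size


# Ratios implied by the rounded totals above.
ratios = {
    country: round(per_thousand(total, astro), 1)
    for country, (total, astro) in PAPERS.items()
}

# Citations to papers with an Arab first author plus those with
# an Arab co-author who is not first author.
arab_citations = 1_507 + 1_596

# Assumed example: 1,000 search hits, 200 sampled, 25% excluded
# (the exclusion rate the text reports for Israel).
estimated_relevant = scale_from_sample(1_000, 200, 150)

if __name__ == "__main__":
    for country, ratio in ratios.items():
        print(f"{country}: {ratio} astronomy papers per 1,000")
    print("Arab-world citations:", arab_citations)
    print("Estimated relevant papers:", estimated_relevant)
```

Scaling a sample in this way assumes that the 200 sampled papers are representative of the full search result for that country.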

Conferences, colloquia and summer 
schools in astronomy are organized, but 
with modest academic impact. The Arab 
Union for Astronomy and Space Science 
(AUASS), a supranational organization link- 
ing professional astronomers and amateur 
associations of the Arab world, holds meet- 
ings every two years, but it has not published 
any proceedings. 


LOOKING FORWARD 

It is time for governments, funding agencies, 
science-advocacy organizations and uni- 
versities of the Arab world to move beyond 
the utilitarian view of science and promote 
professional astronomy. 

Large projects in this field can inspire 
the science and technology community, 
the education sector and the public, and 
shift attitudes towards basic research 
in general. 

This can be done by accelerating efforts 
to build high-class observatories, with 
one- and two-metre telescopes in several 
countries; establishing astronomy pro- 
grammes in all public universities; setting 
up exchange agreements with interna- 
tional institutions; and funding graduate 
students to pursue doctoral programmes at 
universities abroad. 

The Arab world offers ample sites for 
high-quality observatories — several 
mountains have peaks higher than 
3,000 metres. Mountain ranges in the 
Arabian peninsula that span the United 
Arab Emirates, Oman, Saudi Arabia and 
Yemen typically enjoy 200-250 clear nights 
a year. Peaks in the Sinai peninsula reach 
up to 2,600 metres, where at least 150 sum- 
mer nights are clear. Similar suitable sites 
exist in several other countries, from Iraq 
to Morocco. 

Rich Gulf states could work together to 
set up a world-class observatory. A facil- 
ity would cost between US$50 million and 
$100 million, including equipment (a two- 
metre telescope, photometer, spectrom- 
eter, fast computers and network links, 
and a weather station), buildings with 
work and meeting rooms, sleeping quar- 
ters and leisure areas, and local roads and 
infrastructure. 

Arab universities should cooperate. 
Expert meetings should be convened to pro- 
duce white papers on restarting astronomy 
in the region. These activities should be sup- 
ported by international organizations such 
as the AUASS and the IAU to put pressure on 
governments. And it is essential that major 
Arab universities offer degree programmes 
in astrophysics. 

Astronomy has a natural place high in the 
landscape of Arab Islamic culture. It must be 
brought back. 


Nidhal Guessoum is professor of physics 
and astronomy at the American University 
of Sharjah, United Arab Emirates. 
New words on the wild 


Robert Macfarlane reflects on the recent resurgence in nature writing. 


ing,’ rages the hero of The Monkey 

Wrench Gang (1975) about the wil- 
derness. Edward Abbey’s novel went on to 
inspire the 1980 formation of eco-activist 
group Earth First!, whose members under- 
took direct-action campaigns aimed at 
preserving wild places. “No compromise in 
defence of Mother Earth!” was the group's 
original pledge. 

Abbey’s novel was fuelled, like his memoir 
Desert Solitaire (1968), by fury at the despo- 
liation of the landscapes he loved. The book 
formed part of an extraordinary surge of 
writing about nature in North America in 
the 1970s and 80s. Annie Dillard's visionary 
Pilgrim at Tinker Creek (1974), about life 
and death in the Blue Ridge Mountains of 
Virginia, won the Pulitzer Prize in 1975; 11 
years later, Barry Lopez's exploration of the 
American far north, Arctic Dreams (1986), 
became an instant classic and long-term 
best-seller. 

Over the past 15 years, Britain has expe- 
rienced a comparable surge with the blos- 
soming of a literary form that has become 
known as ‘new nature writing’. The tone 
of this form, however, feels far from the 
roustabout activism of Abbey, or even the 
puckishness of Dillard. The genre is distin- 
guished by its mix of memoir and lyricism, 
and specializes in delicacy of thought and 
precision of observation. A number of these 
books have drawn massive acclaim from 
critics and readers: Roger Deakin’s Waterlog 
(1999), a sparkling account of his ‘swimmer’s 
journey’ through the lakes and rivers of the 
British Isles; William Fiennes’s The Snow 
Geese (2002), about flight, home and happi- 
ness; Nature Cure (2005) by Richard Mabey, 
who describes the role of the outdoors in his 
recovery from depression; and Olivia Laing’s 
To The River (2011), a fluent meditation on 
place and memory inspired by the River 
Ouse in Sussex. 

It is widely agreed that Britain is going 
through a golden age of nature writing, 
but no one seems sure quite how to define 
it. Ragtag, wayward and polymorphous, it 
folds in aspects of memoir, travel, ecology, 
botany, zoology, topography, geology, folk- 
lore, literary criticism, psychogeography, 
anthropology, conservation and even fiction. 
Most distinctive, to my mind, is its tonal mix 
of the poetic and the scientific and analytical. 
The Snow Geese, for instance, combines 
exquisite accounts of geese in flight — “the 
flock lifted from the field as a single entity, 
10,000 pairs of wings drumming the air, as 
if people were swatting the dust from rugs” 
— with an inquiry into the biomechanics of 
avian migration. 

Even this aspect of new nature writing, 
though, is hardly new. In the decade after 
the Second World War, US scientists Rachel 
Carson and Loren Eiseley, and pioneering 
conservationist Aldo Leopold — author of 
A Sand County Almanac (1949) — became 
famous for the intimate tone and ethical 
commitment of their essays. They were 
“imaginative naturalists”, to borrow the 
subtitle of Eiseley’s million-selling The 
Immense Journey (1957). Their appeal lay in 
their fusion of the latest research with first- 
person narrative. 


CONTINENTAL DIVIDE 

North America has always been a happier 
habitat for nature writing than Britain, per- 
haps because the vast and various geography 
of the continent — from southern canyon 
lands to northern polar tundra — has pro- 
vided limitless inspiration. For much of the 
twentieth century in the UK, writing about 
wildlife or the countryside was regarded 
with suspicion tending to contempt. Stella 
Gibbons parodied rural writing in Cold 
Comfort Farm (1932): “Daisies opened in 
sly lust to the sun-rays and rain-spears, and 
eft-flies, locked in a blind embrace, spun 
radiantly through the glutinous light...” And 
Evelyn Waugh skewered the plush prose of 
country diarists in his novel Scoop (1938), 
with sentences such as “Feather-footed 
through the plashy fen passes the questing 
vole”. 

For decades after Gibbons and Waugh, 
nature writing seemed in abeyance, dis- 
missed as either reactionary or soppy in 
its longing for oneness with the natural 
world. This didn’t prevent the emergence 
of occasional masterpieces: J. A. Baker's The 
Peregrine (1967), for instance, inspired — as 
in part was Carson's Silent Spring (1962) — 
by Derek Ratcliffe’s landmark studies into 
the effects of pesticides on eggshell-thin- 
ning in British raptors. There was also Nan 
Shepherd's glorious meditation on place and 
being, The Living Mountain (1977), born of 
years of acquaintance with the Cairngorm 
massif of north-east Scotland. 

Why, then, has nature writing enjoyed 
its recent renaissance in Britain? Two 
main causes suggest themselves. The first 
is disembodiment: people are spending 
increasing amounts of time in atmosphere- 
controlled environments, hunched at key- 
boards. An inevitable consequence of this 
has been a longing for wildness and nature: 
the feel of wind and sun upon the face, 
or the sight of a stooping falcon, or of an 
oak tree in spring leaf. Or, as Henry David 
Thoreau cried — having laid his hands 
on the summit rocks of Mount Ktaadn in 
Maine — for “Contact! Contact!” 

The second context is global crisis. It is no 
coincidence that a literature celebrating the 
natural world should have emerged at a time 
when the natural world is so conspicuously 
under threat. The past 15 years have seen 
the Deepwater Horizon blowout in the Gulf 
of Mexico, the break-up of the Antarctic ice 
shelf, widespread habitat destruction, fur- 
ther evidence that we are living through the 
sixth great extinction pulse, and the slow- 
motion emergency of climate change. 

British nature writing is energized by 
this sense of menace and hazard. The bio- 
diversity crash has been tackled boldly 
by Melanie Challenger in On Extinction 
(2011), and wittily by Caspar Hender- 
son in The Book of Barely Imagined Beings 
(2012). Henderson records some of the 
extraordinary species with which we share 
the planet (including the mantis shrimp 
Gonodactylus, which “has the fastest geni- 
tals in the West and will use them to smash 
your head with massive force”). Mean- 
while, Esther Woolfson’s Corvus (2008) and 
Field Notes From A Hidden City (2013) 
examine interspecies relationships and the 
responsibilities we bear to the creatures that 
surround us (echoing Leopold's ‘land ethic’). 

A sense has emerged that nature writ- 
ing might almost provide salvation. “If 
we are to continue to live with birds about 
us we need bird poems as much as the 
RSPB [Royal Society for the Protection 
of Birds],” notes ornithologist Tim Dee, 
author of The Running Sky (2009). And 
Margaret Atwood hopes her novels might 
“move public opinion in a more biosphere- 
friendly direction”. Nature writing to the 
rescue! 

I am less sure. The Anthropocene is a 
frightening and forceful era. I wonder, 
really, what literature can do in the face 
of population pressure, rising sea levels, 
deforestation and the rapacious instinct of 
capital. A few years ago, a monograph by 
the American academic John Felstiner was 
entitled Can Poetry Save The Earth? (2009). 
“No!” I yelled silently at the cover when I 
first saw it, “Of course it can’t!” On reading 
the book, I was curiously relieved to find 
that this was also Felstiner’s answer. 

There are good reasons not to exagger- 
ate the possible consequences of nature 
writing. One is that it often preaches to the 
converted: the people it reaches tend to be 
those with the most developed environ- 

Klinkenborg observed, “seldom pick up this 
kind of writing, or submit to its evidence”. 
Another reason is that nature writing can 
often feel too pious and gentle in its urgings 
— the green equivalent of attending Sunday 
school. 

And yet a law of unintended conse- 
quences has always governed literature, as 
The Monkey Wrench Gang made clear. Writ- 
ing that aims to provoke specific behaviours 
has a name: propaganda. And writing that 
seeks to provoke specific emotions also has 
a name: kitsch. Literature, by contrast, does 
not deal in deliverables. It stirs the sedi- 
ments of thought and morals, setting them 
strangely aswirl. 

“Transformation comes about as much 
because of pervasive changes in the depths 
of the collective imagination as because of 
visible acts, though both are necessary,” 
says Rebecca Solnit, one of today’s most 
interesting US essayists and environmental 
activists. “And though huge causes some- 
times have little effect, tiny ones occasion- 
ally have huge consequences.” The history 
of environmental literature is rich with 
such fascinating ‘transformations’: naturalist 
John Muir’s influence on the founding 
of US national parks, or the thunderclap 
publication of Silent Spring, with its 
consequences for use of DDT 
(dichlorodiphenyltrichloroethane). As 
yet unmapped is the influence of Cormac 
McCarthy’s The Road 
(2006), a novel that chilled me to the core 
and that British journalist George Monbiot 
has described as “the most important envi- 
ronmental book ever written”. 

For literature possesses certain special 
abilities, very different to those of science. 
It can convey us into the minds of other peo- 
ple, and even — speculatively — the minds 
of other species. It can help us to imagine 
alternative futures and counter-factual 
pasts. It is content with partial knowledge 
in ways that science is not. Crucially it 
can, in author and environmentalist Bill 
McKibben’s phrase, make us feel things “in 
the gut”— fear, loss and damage, certainly, 
but also hope, beauty and wonder. And 
these last are, I think, the most important 
emotions in terms of our environmental 
future: our behaviour is more likely to be 
changed by promise than by menace. We 
will not save what we do not love. 


Robert Macfarlane is a fellow of 
Emmanuel College, Cambridge, UK, and 
author of Mountains of the Mind, The 
Wild Places and most recently, The Old 
Ways. 
TECHNOLOGY 


Built by bicycle 


Andrew Robinson mulls over a study of India’s adaptation of low-tech inventions. 


pher Henri Cartier-Bresson, two Indian 
men walk away down an empty, rural, 
palm-fringed road. Between them is an old- 
fashioned bicycle. One man grips the tip of 
a large, conical metal object perched on the 
saddle — the nose cone of a small rocket. 
The caption reads: “Near Trivandrum, 
Kerala. 1966. Preparing for a launch at the 
Thumba Rocket Equatorial Launching Sta- 
tion, housed in a former church.” 

That photograph of two space sci- 
entists encapsulates the thesis of 
Everyday Technology. This pioneering study 
by historian David Arnold examines India’s 
response to certain small-scale technolo- 
gies from the 1880s through to independ- 
ence in 1947 and up to the 1960s — long 
before the country’s digital revolution. 

During the colonial era, British officials 
in India tended to regard its population, 
particularly in rural areas, as too mired in 
conservatism, poverty and illiteracy to adopt 
new technologies. So officials preferred to 
introduce large-scale technological projects 
from the top down, for example electric tel- 
egraphs, railways and irrigation schemes. 
After 1947, this attitude influenced India’s 
first prime minister, Jawaharlal Nehru, who 
conducted an all-out pursuit of foreign- 
constructed hydroelectric dams and steel 
mills, and introduced nuclear power and a 
space programme. For Nehru, big dams were 
“temples of the new age” — emblems of an 
India untainted by its messy social reality. 

Arnold, by contrast, believes that under- 
standing technology demands an apprecia- 
tion of the society embracing it, “even when 
the technological goods themselves remain 
largely foreign”. He also argues that the slow 
spread of small-scale technologies, such as 
the sewing machine, prepared the way for 
India’s later adoption of more sophisticated 
ones. By domesticating imported inventions, 
colonial societies undergo self-transforma- 
tion, Arnold suggests. 

As Cartier-Bresson’s photograph hints, 
the bicycle in India has been a means of 
carrying people (sometimes three or four 
at a time), things and ideas. Even today it 
remains essential for 
millions of poorer 
Indians, who now may 
well also use a mobile 
phone. Bicycles have 
been converted into 
cycle rickshaws and three- and four-wheeled 
carts; their basic mechanism has been used 
to power knife grinders and foot-powered 
looms. As Arnold shows, by the early twen- 
tieth century, the bicycle — along with 
three other low-tech mainstays, the sewing 
machine, the typewriter and the rice mill — 
was deeply woven into Indian society. 

In the early 1920s, India was import- 
ing almost 50,000 bicycles a year; by inde- 
pendence, the number was five times that. 
In 1948, during the final fast of Mahatma 
Gandhi, the great leader of India’s independ- 
ence movement, 5,000 cyclists converged at 
a house in Delhi to hear Nehru report on 
Gandhi's health, their cycle lamps glowing 
in the twilit garden like giant fireflies. 

Henri Cartier-Bresson’s Indian space scientists. 

Most of these technologies were opposed 
by Gandhi. But, as Arnold is at pains to 
detail, Gandhi's well-known aversion to 
machines was not down to Ludditism. It was 
based on serious thought, and had a strong 
influence on the development of the post- 
1947 cottage industries movement. 

Gandhi opposed the bicycle mainly 
because buying an imported luxury would 
lead to debt, although he permitted himself to 
travel by automobile. He had employed typ- 
ists in his legal practice in turn-of-the-century 
South Africa and learned to type. But when 
he returned to India in 1915, he declared 
the typewriter “a cover for indifference and 
laziness”, preferring to write his voluminous 
output of letters and articles by hand. 
He objected to the rice mill, firstly 
because it would deprive poor women of 
income from pounding rice, secondly because 
pounding was good exercise and thirdly 
because milling removed the vitamin thiamine 
from the pericarp of the rice grain, a 
deficiency of which causes the disease 
beriberi that affected parts of India. 
Often critical of Western medicine, Gandhi 
was happy in this case to accept scientific 
evidence from two colonial nutritionists, 
Robert McCarrison and W. R. Aykroyd, but 
ignored their argument that less rigorous 
milling would preserve sufficient levels of 
the vitamin to prevent beriberi. However, 
Gandhi famously advocated the spinning 
wheel, and (less famously) championed the 
treadle sewing machine, particularly Singer's, 
describing it as “one of the few useful things 
ever invented”. 

Everyday Technology: Machines and the Making of India’s Modernity 
DAVID ARNOLD 
Chicago University Press: 2013. 224 pp. £21 

Everyday Technology organizes an 
enormous amount of unfamiliar detail on a 
hitherto largely neglected subject, reinforced 
with copious statistics and illustrated with 
some appealing historical and contempo- 
rary images. It is enlivened by apt quota- 
tions from novels and films of the period, 
although regrettably includes none from the 
films of India’s greatest director, Satyajit Ray. 
Ray’s works offer many subtle reflections on 
people and technology, not least the trains, 
small-scale machinery and office atmos- 
phere depicted in his celebrated Apu Trilogy. 

However, the parts of this book are greater 
than the sum. The author’s thesis is abun- 
dantly proven, but his conclusions seldom 
surprise. I am also left with the uncomfort- 
able feeling that for all the enthusiasm with 
which modern India has responded to for- 
eign technology, it has yet to create anything 
comparable with the achievements of its 
pre-colonial mathematicians, scientists and 
technologists. 


Andrew Robinson is a writer and the 
editor of Exceptional Creativity in Science 
and Technology. He has written nine books 
on India, and the forthcoming India: A 
Short History. 

e-mail: andrew.robinson33@virgin.net 
Primatological derring-do 


Kelly Stewart revels in a graphic biography that follows the human and scientific 
stories of three iconic primate researchers. 


As enthralling careers go, those of 
primatologists Jane 
Goodall, Biruté Galdikas and the 
late Dian Fossey take some beating. Now 
the personal and scientific stories of these 
pioneers of research on, respectively, wild 
chimpanzees, orangutans and gorillas fea- 
ture in Primates, an engaging graphic biog- 
raphy by Jim Ottaviani and illustrator Maris 
Wicks. Unifying the three intimate first- 
person narratives is the figure of renowned 
palaeontologist Louis Leakey, who helped to 
launch all three researchers’ careers. 

Ottaviani clearly carried out extensive 
research on published material by and 
about the trio of researchers, including 
diaries and letters. For a book that takes well 
under an hour to read and aims to engage 
teens, Primates offers a remarkable amount 
of information on many different levels. The 
life stories may be rendered as cartoons, but 
the characters come across as multi-dimen- 
sional. And there is plenty of human interest, 
from Fossey’s uncompromising ferocity and 
mercurial personality to Galdikas’s painful 
choice between returning to Canada with 
her husband and young son or remaining 
in Indonesia to continue her work with 
orangutans. 

How true to life is it? As someone famil- 
iar with all three stories — especially that 
of Fossey, with whom I studied gorillas in 
Rwanda — I’d say it’s an accurate rendition. 


We learn about the logistics of fieldwork, 
which involves no shortage of discomforts 
to satisfy the gruesome fascination of young 
readers — days of being rained on, isolation, 
exhaustion and illness. Galdikas’s story, set in 
the leech-infested Indonesian forest, is especially 
rich in the ‘ick factor’. The characters cut 
trails, sift through dung for food remnants, 
spend months living alongside and observ- 
ing their subjects, present their findings at a 
conference, suffer academic insecurity and 
social awkwardness, and struggle to balance 
anthropomorphism and objectivity. 

Making it all worthwhile is the fascinat- 
ing allure of living in the wild and becom- 
ing immersed in the 
lives of members of 
another species. Tri- 
umph also comes 
with discoveries, such 
as Goodall making 
the first observation 
of chimpanzees using 
tools. And of course, 
Ottaviani describes 
the inevitable fight to 
conserve the apes and 
their habitats, which 
eventually becomes 


Primates: The 
Fearless Science of 
Jane Goodall, Dian 
Fossey, and Biruté 
confusing shift in storyline or viewpoint, but 
is generally pithy and fast-paced. Although 
Wicks’s artwork does not dazzle, it gets the 
point across, making ample use of expres- 
sions of comic exaggeration such as bug- 
eyed surprise. She also applies comic touches 
that convey the vitality of forests and their 
denizens, with ape vocalizations that change 
colour and burst out of the cartoon frame. 
Many illustrations are clearly modelled on 
specific photographs from early National 
Geographic articles. 

The best thing about Primates is that 
Leakey and the primatologists essentially 
become action heroes. They are unconven- 
tional but undaunted, persevering against 
the odds. They make sacrifices, lead daring 
lives, uncover mysteries and fight for the 
good; science and scientists are portrayed 
as cool and exciting. 

This book won't teach kids much about 
the great apes, but that isn’t its point. If it 
inspires young readers to explore the reading 
list provided at the end, and perhaps become 
scientists or conservationists, then — as one 
might say at the end of any action comic — 
‘mission accomplished’. ■ 


Kelly Stewart is a research associate in the 
Department of Anthropology, University 
of California, Davis. She is co-author of 
Gorilla Society with Sandy Harcourt. 
e-mail: kjstewart30@gmail.com 

Save Caatinga from 
drought disaster 


Brazil’s semi-arid Caatinga 
scrub forest is experiencing its 
worst drought in 30 years, with 
more than 300 settlements in the 
northeast at the point of collapse 
(see go.nature.com/pngjfq). The 
federal government must urgently 
address the drought’s disastrous 
effects on livelihoods and on the 
survival of this biosphere reserve. 

Whether natural fluctuations 
in temperature and rainfall or 
climate change are to blame, the 
lack of water is killing livestock 
and destroying crops. Pressure 
on Caatinga land is increasing as 
local people hunt wild animals 
for food and trade, often burning 
vast areas of vegetation to 
flush out their prey. We have 
seen native plants being used 
indiscriminately to fuel furnaces 
for brick production so that 
families can buy food and water 
from other regions of Brazil. 

The plight of this stricken area 
is being largely ignored by the 
media and by the government. 
Human survival should no 
longer need to depend on the 
destruction of local biodiversity. 
Roberto Leonan Morim 
Novaes, Saulo Felix Federal 
University of the State of Rio de 
Janeiro, Brazil. 
roberto_leonan@yahoo.com.br 
Renan de França Souza State 
University of Rio de Janeiro, Brazil. 


In praise of open 
research measures 


On behalf of the Data-Enabled 
Life Sciences Alliance (DELSA 
Global), we applaud the 
significant, timely steps Nature is 
taking to ensure reproducibility 
and transparency in life-sciences 
articles (Nature 496, 398; 2013 
and go.nature.com/oloeip). 

We discussed Nature's 
Reporting Checklist for Life 
Sciences at our annual workshop 
last month (see www.delsaglobal. 
org). By encouraging researchers 
to make their data and metadata 
available, and to clarify their 
analysis methods, the checklist 
will help to prevent mistakes 
from being propagated and 
resources from being wasted on 
dead-end experiments. 

This is important in an era of 
tight funding and limited training 
in the quantitative aspects of 
research, both of which inhibit 
confirmatory experimentation. 
In addressing the veracity of 
data as well as the reliability 
and reproducibility of research, 
Nature’s checklist will stimulate 
the transformation of data into 
knowledge, action and outcomes. 

Scientific advances need 
strong public support to make 
a difference, and your policies 
constitute an important step in 
preserving public trust in science. 
The checklist can act as a useful 
template for development by 
publishers, federal agencies, 
funders, research organizations, 
societies and communities. 
Eugene Kolker* Seattle Children’s 
Research Institute, Seattle, 
Washington, USA. 
eugene.kolker@seattlechildrens.org 
*On behalf of 21 co-signatories. See 
go.nature.com/6mypmw for full list. 

Curb China’s rising 
food wastage 


China is currently managing to 
feed its people (F. Zhang et al. 
Nature 497, 33-35; 2013), but 
food loss and waste throughout 
the supply chain must be taken 
into account if food security is to 
be maintained in the future. 

Of China’s grain output, an 
estimated 8%, 2.6% and 3% are 
lost during storage, processing 
and distribution, respectively 
—a total of some 35 million 
tonnes annually. As in many 
other developing countries, 
these alarming losses are a result 
of inadequate infrastructure, 
knowledge and technology, and 
are exacerbated by a decentralized 
agricultural production system. 

China’s increasing affluence is 
also leading to wide-scale food 
wastage. For example, household 
food waste totals roughly 2.5% of 
grain a year (around 5.5 million 
tonnes). This is fast approaching 
Western levels (see, for instance, 
go.nature.com/erz4if). 

The pattern and scale of food 
waste in China are still unclear: 
more quantitative research will 
help to inform policy-making 
and to increase public awareness 
of the problem (see our pilot 
study at go.nature.com/8zq1je; in 
Chinese). 

Gang Liu Norwegian University 
of Science and Technology, 
Trondheim, Norway. 
gang.liu@ntnu.no 

Xiaojie Liu, Shengkui Cheng 
Institute of Geographical Sciences 
and Natural Resources Research, 
Chinese Academy of Sciences, 
Beijing, China. 


Education: enticing 
students into science 


Colin Macilwain argues that the 
United States would not need 
to spend US$3 billion annually 
on a programme to encourage 
young people to pursue careers in 
science, technology, engineering 
and mathematics (STEM) if 
market forces were right (Nature 
497, 289; 2013). But conditions 
and opportunities in science have 
not visibly improved for students 
in the past three decades. 


Twenty-four years ago, the US 
government was also trying to 
attract young people into STEM 
(T. Packard Eos 70, 709; 1989). 
Then, 66% of the ocean-science 
community was living hand-to- 
mouth on short-term government 
grants. A university professor 
was expected to do 40 hours of 
teaching and administration a 
week, and 40 hours of research. 
Researchers who did not receive 
funding from their universities 
dared not spend time away from 
their work, lest their publication 
record should drop. 

It was clear to me at the 
time that if research centres, 
universities, governments and 
societies wanted more people to 
work in science and technology, 
then salaries, job stability and job 
security would have to improve. 
They still have not. 

Young people continue to 
shun research and instead opt 
to use their mathematical skills 
in accounting, their analytical 
skills in investment banking 
and their love of science in 
medicine. Macilwain blames 
business for the woeful range 
of scientific opportunities 
available to graduates. Whether 
the fault lies with business, 
government or universities, the 
educational pipeline in science 
and engineering does not work 
because graduates are scared off 
by what they see as a meat grinder 
at the other end. 

Theodore T. Packard University 
of Las Palmas de Gran Canaria, 
Spain. 

theodore.packard@ulpgc.es 


Education: science 
literacy benefits all 


Colin Macilwain wields too wide 
a brush in painting US federal 
funding of STEM education (for 
promoting ‘science, technology, 
engineering and mathematics’) 
as having the sole purpose of 
bolstering the workforce (Nature 
497, 289; 2013). This funding also 
achieves general science literacy, 
particularly when it is directed 
towards children in primary 
and secondary education or 
undergraduate students. 

No matter how far they are 
pushed, most teens and young 
adults will not become scientists. 


Fortunately, many STEM 
programmes familiarize students 
with the scientific process and 
with the natural world. Learning 
fundamental concepts also 
teaches them how to interpret and 
handle scientific information. 
Science literacy subsequently 
benefits individuals throughout 
their lives, from forming 
opinions about proposed 
government policies to making 
health-care decisions. A well- 
informed citizenry, in turn, pays 
dividends to society as a whole. 
Aaron C. Hartmann University 
of California, San Diego, USA. 
achartma@ucsd.edu 


Climate and war: a 
call for more research 


The possibility that climate 
change could be responsible for 
violent conflict (A. Solow Nature 
497, 179-180; 2013) is starting 
to influence how governments 
frame and react to climate 
change. However, a real problem 
in this area is a paucity of theory 
to explain the associations (if 
any) between climate change and 
the outbreak of violence. 

One overlooked factor is that 
populations caught up in conflicts 
or living in post-conflict societies 
are often more vulnerable to 
climate change. For example, the 
presence of landmines makes 
productive land inaccessible. 

Climate policies can 
themselves be a source of conflict 
(see go.nature.com/zutmox). 
Measures that manage carbon 
sources and sinks or treat them 
as commodities — such as 
land-use changes, hydropower 
development or initiatives 
to reduce emissions from 
deforestation — can stimulate 
civil unrest if implemented 
without adequate checks. 

Poverty, a history of fighting, 
and weak governance are 
well-established risk factors 
for conflict. The likelihood of 
violent conflict is reduced by 
democracy, social protection, 
effective justice systems and the 
protection of property rights. 
The influence of climate change 
on these factors warrants further 
investigation to guide policy- 
makers in promoting peace and 
prosperity in a changing climate. 
Neil Adger University of Exeter, UK. 
n.adger@exeter.ac.uk 

Jon Barnett University of 
Melbourne, Victoria, Australia. 
Geoff Dabelko Ohio University, 
Athens, USA. 


Climate and war: no 
clear-cut schism 


We are sceptical about the 
effectiveness of Andrew Solow’s 
proposals for cooling the debate 
over a possible link between wars 
and climate change (Nature 497, 
179-180; 2013). We think that the 
division between the two sides 
(‘quants’ versus ‘quals’) is not as 
clear-cut as he implies. 

Solow argues that this dividing 
line distinguishes between those 
who search for connections 
between violence and natural 
phenomena, including climate- 
related factors (quants), and those 
who prefer to explain conflicts as 
social processes (quals). But both 
approaches are studied by quants 
as well as quals. Quants may study 
climate-related effects on conflicts 
by analysing single events in detail 
or by considering many wars on 
aggregate using statistics. 

There are also strong 
disagreements among those on 
each side of Solow’s dividing line. 
For example, quants as well as 
quals include both proponents 
and sceptics of the connections 
between climate change and 
violent conflict. 

In our view, the true divide is 
not so much about substance as 
about perspective. 

Michael Brzoska, Jürgen 
Scheffran University of 
Hamburg, Germany. 
brzoska@ifsh.de 


Gender equality in 
Australian academies 


Women are not under- 
represented across all learned 
academies in Australia (see 
Nature 497, 7 and Nature 
497, 439; 2013). For example, 
the Australian Academy of 
Technological Sciences and 
Engineering (ATSE; of which I 
am president) has taken steps 
to ensure that women are 
appropriately recognized and 
included in all its activities. 


Gender imbalance can 
adversely affect all stages 
of a scientific career, from 
tertiary education to employer 
recruitment, retention and 
promotion, with implications 
for a country’s productivity and 
prosperity. Over the past three 
years, ATSE has led the way in 
identifying and promoting female 
talent across the science and 
technology sector in Australia, 
and within the academy itself. 

One key element of ATSE’s 
gender-equality policy is to 
identify women candidates for 
fellowship nomination through 
active search and mentoring 
processes. Last year, 10 of 37 
elected fellows were female, and 
women now comprise 40% of 
ATSE’s governing board. 
Alan Finkel ATSE, Toorak, 
Australia. 
alan@finkel.net 


European concerns 
over GM salmon 


As investigators for the European 
Food Safety Authority into the 
environmental risks posed by 
genetically modified (GM) fish, 
we are concerned about the US 
Food and Drug Administration's 
imminent approval of GM salmon 
(Nature 497, 17-18; 2013). 

This is a huge step that could 
encourage aquaculture of other 
GM fish in other countries, and 
not necessarily under strictly 
biosecure conditions. 

There is still considerable 
uncertainty surrounding the 
environmental and physiological 
effects of escaped, fast-growing 
GM fish on aquatic systems. This 
reflects a poor understanding of 
how different species might be 
affected as the modified gene is 
expressed in the wild. 

Europe's regulatory guidelines 
for aquaculture of GM fish and 
other alien species in Europe 
will therefore be underpinned 
by rigorous risk assessment (see 
go.nature.com/p6x2qb). 

J. Robert Britton Bournemouth 
University, Poole, UK. 
Rodolphe E. Gozlan 
Bournemouth University, Poole, 
UK; and Institut de Recherche 
pour le Développement (UMR 
207), Paris, France. 
rudy.gozlan@ird.fr 

NEWS & VIEWS 


Cloning human embryos 


Human embryonic stem cells have at last been generated by a technique called somatic-cell nuclear transfer. Further 
research on such cells should provide insight into ways of improving the generation of stem cells by reprogramming. 


CHRISTINE L. MUMMERY & 
BERNARD A. J. ROELEN 


The birth of Dolly the sheep in 1996 
produced great excitement among 
researchers. This first cloned mammal 
had been created¹ by introducing the nucleus of 
a somatic (non-germ) cell into an egg cell from 
which the genomic DNA had been removed, 
and transferring the resulting embryo into a 
foster mother. One implication of this achieve- 
ment was that similarly cloned embryos could 
be used to produce stem cells that would be 
genetically identical to the cells of the somatic- 
cell donor, so that if these stem cells, or cells or 
tissues derived from them, were transplanted 
into the donor for treatment purposes they 
would not be rejected by the donor’s immune 
system. In the years that followed, this tech- 
nique of somatic-cell nuclear transfer (SCNT) 
was successfully used to produce stem cells 
from cloned mouse embryos, but all attempts 
in humans had failed — until the publication of 
a study in Cell by Tachibana et al. 

The study is noteworthy for several reasons. 
First, it explains why all previous attempts at 
cloning human embryos have failed. The 
human egg (oocyte), and that of most mam- 
mals, is released from the ovary at the meta- 
phase II stage of meiotic cell division. The cell 
resumes and completes meiosis only after it is 
fertilized. Removal of the meiotic spindle — 
the cellular structure that ensures the faith- 
ful distribution of chromosomes between 
dividing cells — is an integral part of SCNT. 
Tachibana et al. realized that this induces 
premature completion of meiosis in human 
eggs and subsequent loss of their capacity to 
reprogram somatic cells to a pluripotent state, 
which would allow them to differentiate into 
all cell types in the body. Crucially, the addi- 
tion of caffeine to the culture medium slowed 
meiotic completion, ensuring the success of 
the authors’ procedure. 

The paper also shows that blastocysts 
(roughly 100-cell embryos) derived using 
this modified SCNT protocol were healthy 
enough to be used for generating embryonic 
stem (ES) cells that were genetically identical 
to the donor nucleus. And, notably, the authors 
have managed to generate these SCNT-ES cells 
using nuclei not only from fetal cells but also 
from post-natal cells. This approach could 
therefore be used to create cellular models of 
the genetic disease that a somatic-cell donor 
might carry. 

Figure 1 | Generation of pluripotent stem cells in vitro. a, Tachibana et al. show that human 
embryonic stem (ES) cells can be generated by a technique called somatic-cell nuclear transfer (SCNT). 
The authors removed the meiotic spindle from an oocyte arrested at the metaphase II stage of meiotic cell 
division. The oocyte had been incubated with caffeine to prevent premature completion of meiosis. They 
then inserted a somatic cell into the enucleated oocyte. Oocyte activation and cellular reprogramming 
followed, leading to blastocysts from which SCNT-ES cell lines were derived. b, By comparison, the 
generation of induced pluripotent stem (iPS) cells involves the introduction of four pluripotency-related 
transcription factors into a differentiated cell to induce its direct reprogramming. 

These technical achievements, for which 
researchers worldwide have strived for at least 
a decade, should be celebrated. Nonetheless, 
parallel advances made in the field of stem-cell 
research somewhat dampen the excitement 
that the present paper might have received — 
the sort of excitement that was generated some 
nine years ago by Woo Suk Hwang’s report of 
similar results, before it was discovered that 
those data had been fabricated’. 

In our opinion, the discovery in 2006 
that differentiated adult cells can be directly 
reprogrammed to a stem-cell-like state called 
induced pluripotent stem (iPS) cells was a 
more significant breakthrough for this research 
field. iPS cells can be generated by introducing 
just four transcription factors into differenti- 
ated cells of an individual, without the need for 
the ethically sensitive step of creating embryos 
from oocytes as intermediates (Fig. 1). Indeed, 
many laboratories now routinely generate iPS 
cells from patients, bypassing the practical and 
regulatory difficulties associated with obtain- 
ing human oocytes. 

But an intriguing question now is how simi- 
lar human iPS and SCNT-ES cells are. One dif- 
ference is immediately apparent. In iPS cells, 
mitochondria (organelles that are the main 
source of cellular energy), as well as all other 
organelles, originate from the donor cell. In 
SCNT-ES cells, the mitochondria are derived 
from the oocyte and not from the donor of the 
nucleus. Apart from the nucleus, mitochon- 
dria are the only organelles that contain DNA, 
which encodes around ten genes. This means 
that SCNT-ES cells might activate the immune 
system of an individual who is ostensibly being 
treated with their ‘own’ SCNT-ES cells and 
cause them to be rejected. 

On the other hand, it makes these cells suita- 
ble for studying mitochondrial diseases, which 
are maternally inherited. However, to create 
their SCNT-ES cells, Tachibana et al. used 
nuclei from the cells of a patient with Leigh 
syndrome — a disorder that can be caused by 
mutations in a mitochondrial gene. Because 
mitochondria in SCNT-ES cells originate from 
the oocyte, these cells would not carry the 
same mutation, nor model Leigh syndrome. 
Nonetheless, the cells generated are a proof of 
principle that adult somatic cells can be used 
in human SCNT. 

The present study shows that the authors’ 
SCNT-ES cells meet the main criteria for pluri- 
potency: they can differentiate in vitro; they 
express pluripotency genes; and, when injected 
into an immune-deficient mouse, they form 
a teratoma — a type of tumour that contains 
many different cell types. Nevertheless, other 
properties of these cells were not extensively 
explored. 

For instance, although human iPS cells are 
known to accumulate mutations without the 
necessary care’, overall they are very simi- 
lar to ES cells derived from normal ‘surplus’ 
human embryos obtained by in vitro fertiliza- 
tion (IVF) treatment, and they are genetically 
and epigenetically stable under careful culture 
conditions over long periods’. Tachibana et al., 
however, did not compare the efficiency of 
SCNT-ES cells, human IVF ES cells and iPS 
cells at differentiating in vitro under optimal 
conditions. Similarly, it is unclear whether 
SCNT-ES cells remain stable over time. Fur- 
ther investigation along these lines would be 
beneficial. 

What this study provides is an excellent 
source of reference. Direct reprogramming of 
human iPS cells takes several weeks, whereas 
SCNT-ES cells are reprogrammed within a 
few hours by the natural factors present in the 
oocyte, and could in principle give rise to new 
offspring. A head-to-head comparison of these 
cell types over a long culture period would be 
ideal, not least to identify factors that might 
improve the efficiency and yield of direct 
reprogramming. 

A cautionary note: since the paper’s pub- 
lication, there has been ongoing discussion 
about some errors, such as possible figure 
duplication and mislabelling, that it contains’. 
We therefore eagerly await experimental 
confirmation of Tachibana and co-authors’ 
results by others, as well as the outcome of an 
investigation by Cell to determine how such 
errors occurred and whether they affect the 
study’s overall conclusions (as Nature went 
to press, the results of this investigation had 
not been released). These concerns notwith- 
standing, the present findings are a major 
development, particularly for those studying 
human reproduction and IVF. Whether it is 
a game changer for research into understand- 
ing disease, regenerative medicine and drug 
discovery is debatable. ■ SEE COMMENT P.159 

the atomic way 


Ultracold atomic gases are excellent platforms for exploring phenomena in 
condensed-matter physics. They have now been used to engineer the spin Hall 
effect and to make the atomic counterpart of the spin transistor. SEE LETTER P.201 


PETER VAN DER STRATEN 


The spin of elementary particles is a con- 
cept in quantum mechanics that has 
no counterpart in the classical world. 
Although the electron’s spin was inferred 
from the 1922 Stern–Gerlach experiment on 
the deflection of particles, it has only recently 
been exploited in electronic devices — con- 
ventional electronic circuitry is based on the 
electron’s charge. Atoms also carry a spin, and 
so could similarly be exploited for spin-based 
electronics. On page 201 of this issue, Beeler 
et al.¹ demonstrate how ultracold atoms can be 
used to build the atomic analogue of an elec- 
tronic switch that was proposed more than 
20 years ago: the spin transistor’. 

Spin-based electronics, or spintronics, 
has seen tremendous developments in the 
past decade’. At the heart of this field is the 
manipulation of the electron’s spin, which in 
condensed-matter materials can be rather 
complicated owing to the interaction of this 
spin with its surroundings and the limited 
controllability of the materials. In this regard, 
an effect called spin-orbit coupling, which 
allows the manipulation of the electron’s spin 
without the use of local magnetic fields, has 
been instrumental in simplifying the construc- 
tion of spintronic devices. 

During the past ten years, ultracold atomic 
gases have become the ideal playground in 
which to investigate fundamental phenom- 
ena in many fields of physics, particularly 
condensed-matter physics. The excellent 
manipulation of both the internal and the 
external degrees of freedom of ultracold atoms 
allows the study of complex physics in a con- 
trolled way. Atoms have the advantage that 
they carry spin but have no charge. Therefore, 
charge effects that would otherwise need to 
be considered can be excluded. Furthermore, 
atoms can have either a fermionic or a bos- 
onic particle character (they have half-integer 
or integer spin, respectively), so the effect of 
different quantum-particle statistics on spin 
interactions can readily be tested. Finally, 
ultracold atomic gases confer more versatility 
on spin manipulation and detection than do 
electrons, yielding new ways to explore the 
richness of spintronics. 

Figure 1 | Going separate ways. Beeler et al.¹ have demonstrated that two laser beams (purple arrows) 
can be directed at a cloud of atoms to generate an artificial magnetic field (not shown) that makes atoms 
of two spin orientations (red and blue) move in opposite directions. 

To make spintronic devices based on 
atomic spins, spin-orbit coupling needs to be 
addressed. In atomic physics, the effect cou- 
ples the electron’s spin with its orbital angular 
momentum through the electrical field of the 
nucleus, and leads to the fine structure (small 
splitting) of the internal states of the atom. In 
condensed-matter systems, however, the cou- 
pling is between the electron’s spin and its lin- 
ear motion, and is caused by the electrical field 
of the underlying atomic lattice. Researchers 
have recently proposed ways to create artificial 
electromagnetic fields (gauge fields) to induce 
such spin–motion interaction in atomic sys- 
tems. These fields are produced by coupling 
internal states of atoms with laser beams at 
ultra-low temperatures. In their experiment, 
Beeler et al. used one such field to engineer 
spin–motion interaction in an atomic system 
and to observe a quantum effect known as the 
spin Hall effect. 

The spin Hall effect is similar to the con- 
ventional Hall effect, in which the positively 
and negatively charged particles of an electri- 
cal conductor are separated by a transverse 
magnetic field, producing a voltage at a right 
angle to the current. In the spin Hall effect, 
the particles go their separate ways accord- 
ing to whether their spin points in one of two 
opposing directions. By using their gauge 
field, which was created by means of two laser 
beams, Beeler et al. convincingly show that 
atoms with opposite spin states that travel at 
right angles to the magnetic field produced 
by the gauge field move in opposite directions 
(Fig. 1). The researchers go on to show that 
their data agree well with theoretical calcula- 
tions, indicating that the authors have a proper 
understanding of the mechanism behind the 
spin Hall effect in their system. 

But Beeler and colleagues have taken their 
work one step further: they realized the atomic 
analogue of a spin transistor. Although previ- 
ously proposed in 1990, the spin transistor 
has been made only within the past few years’. 
By using the displacement of the atoms in the 
gauge field as the transistor’s voltage differ- 
ence between the drain and source electrodes, 
and the strength of the two laser beams as the 
transistor’s voltage of the gate electrode, Beeler 
et al. realized an atomic system that shows the 
characteristics of a typical spin transistor. The 
simplicity and robustness of the authors’ tran- 
sistor also makes it a good option for splitting 
atoms according to their spin in a device known 
as a Mach-Zehnder interferometric sensor. 

Beeler and colleagues’ experiment opens 
up many avenues in the field of ultracold 
atomic gases. The gauge field produced is of 
the ‘Abelian’ type; however, there are propos- 
als to generate non-Abelian gauge fields. 
These non-Abelian fields are more difficult 
to realize experimentally but allow a closer 
comparison with condensed matter, in which 
non-Abelian fields are usually responsible 
for the spin-motion interaction. Beeler et al. 
induce the spin Hall effect in their system 
by using spin–motion coupling, but in con- 
densed matter the effect can also be induced by 
scattering of electrons by impurities. Although 
ultracold atomic gases are free of impurities, 
interactions between the atoms can be tuned 
to be made strong and yield exotic phenomena 
such as superfluidity. 

Spin-orbit interactions can lead to topo- 
logical insulators, which are insulating in their 
bulk but have topologically protected conduct- 
ing states on their boundaries. Such states can 
easily be controlled and detected in ultracold 
atomic gases. The crossroads between ultra- 
cold atomic gases and condensed-matter 
physics provide fertile ground for research: 
the former focuses on fundamental knowledge 
obtained through the study of well-character- 
ized systems under controllable conditions, 
whereas the latter applies such knowledge in 
information technologies. Many new phenom- 
ena can be expected to surface in these areas in 
the next few years. ■ 


STEM CELLS 


Regulation by 
alternative splicing 

Stem-cell differentiation is controlled by RNA processing — as well as by gene 
expression and transcription. This finding is a milestone towards realizing these 
cells’ potential for research and therapy. SEE LETTER P.241 


YAIR AARONSON & ERAN MESHORER 


Mammalian genomes contain some 
20,000 genes. Yet the process of 
alternative splicing ensures that the 
number of proteins arising from these genes 
is at least ten times greater’. It achieves pro- 
tein diversity by varying the way in which the 
RNA transcript of a gene is processed: each of 
the protein-coding sections of a transcript can 
be either spliced out or left in to form differ- 
ent mature messenger RNAs. Consequently, 
multiple variants of a protein (isoforms) can 
be produced from a single gene, in a tissue- 
specific or developmental-stage-specific 
manner. On page 241 of this issue, Han et al.’ 
describe the role of alternative splicing in the 
regulation of embryonic stem cells, thereby 
adding another notable regulatory layer to the 
known mechanisms that govern stem-cell state 
and differentiation’. 
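
The combinatorial argument above, that leaving each protein-coding section in or out multiplies the number of possible mature mRNAs, can be illustrated with a short sketch. This is a toy model only: the five-exon gene, the exon names and the helper `enumerate_isoforms` are hypothetical and do not come from Han and colleagues' paper, and real splicing is regulated, so not every combination occurs in a cell.

```python
from itertools import product


def enumerate_isoforms(exons, optional):
    """Return every exon combination obtained when each optional
    exon is independently left in or spliced out (toy model)."""
    optional = set(optional)
    # Keep the optional exons in transcript order.
    switchable = [e for e in exons if e in optional]
    isoforms = []
    for choices in product([True, False], repeat=len(switchable)):
        included = dict(zip(switchable, choices))
        # Constitutive exons are always kept; optional ones follow the choice.
        isoforms.append(tuple(e for e in exons if included.get(e, True)))
    return isoforms


# Hypothetical five-exon gene in which E2, E3 and E4 are optional.
EXONS = ["E1", "E2", "E3", "E4", "E5"]
ISOFORMS = enumerate_isoforms(EXONS, optional=["E2", "E3", "E4"])

for isoform in ISOFORMS:
    print("-".join(isoform))
print(len(ISOFORMS))  # 2**3 = 8 possible transcripts from one gene
```

With only three optional exons the single gene already yields eight candidate transcripts, which gives a feel for how a fixed number of genes could give rise to a much larger number of protein isoforms.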

Embryonic stem (ES) cells have two special 
qualities: they can undergo an unlimited num- 
ber of divisions, and they are pluripotent — 
that is, they can differentiate into any cell type 
of a mature organism. These cells, therefore, 
have great potential for clinical use and can 
serve as models for studying disease. Research 
into how ES-cell pluripotency is regulated has 
mainly focused on the control of gene expres- 
sion through modifications of chromatin (the 
complex of DNA and proteins in chromo- 
somes) and during transcription. But, despite 
several master regulators of alternative splicing 
having been identified, none has been impli- 
cated in ES-cell maintenance, differentiation or 
transcription (Fig. 1). 

*This article and the paper under discussion were 
published online on 5 June 2013. 

To screen for alternative-splicing events 
associated with pluripotency, Han et al. studied 
RNA data from pluripotent cells and various 
differentiated cells from humans and mice. The 
pluripotent cells they investigated included 
not only ES cells but also induced pluri- 
potent stem (iPS) cells, which are ES-like cells 
derived through molecular reprogramming of 
differentiated cells’. 

The authors identified dozens of alterna- 
tive-splicing events that differed between 
pluripotent and differentiated cells, including 
a previously known* ES-cell-specific event in 
the mRNA for the pluripotency factor FOXP1. 
And when they measured the expression levels 
of many known splicing regulators, the authors 
found a few that differed significantly between 
pluripotent cells and differentiated cells. In 
particular, two of the regulators — MBNL1 
and MBNL2 — showed very low expression 
in ES cells and much higher expression in 
differentiated cells. 

How do MBNL1 and MBNL2 affect stem- 
cell identity? The researchers report that sites 
of alternative splicing in ES-cell transcripts 
are highly enriched in MBNL1- and MBNL2- 
binding motifs, and that these factors spe- 
cifically bind to the sites in a unique pattern. 
So it seems that the binding patterns of 
these regulators control the omission or inclu- 
sion of protein-coding regions (exons) in the 
mature mRNA. 

Figure 1 | Multilayer regulation of cell differentiation. a, The nature and amounts of chromatin- 
modifying and -remodelling proteins (blue and yellow) that bind to DNA and its associated histone 
proteins differ between stem cells and differentiated cells, affecting gene expression through regulation 
at the level of chromatin. b, At the level of transcription, specific transcription factors orchestrate the 
distinct transcript output of stem cells compared with differentiated cells. c, Han et al. show that at 
the level of transcript processing, regulators of alternative splicing, such as MBNL proteins, govern the 
differences in mRNA, and thus protein, output between stem cells and differentiated cells. 

The cellular levels of MBNL proteins 
also seem to affect the differentiation state. 
Increased expression of these proteins in ES 
cells induced differentiation-specific alterna- 
tive-splicing events, and decreased the levels 
of an ES-cell-specific isoform of FOXP1. Consistently,
reducing expression of these proteins
in differentiated cells led to a switch of the 
alternative-splicing program to an ES-cell-like 
pattern. And the efficiency of reprogramming 
of differentiated cells into iPS cells was greatly 
enhanced with reduced expression of MBNL1 
and MBNL2 (the splicing pattern associated 
with ‘stemness’ was particularly prominent in 
cells that were successfully sustained through 
the later parts of the reprogramming process). 

Han and co-workers’ paper sets the stage for 
extensive follow-up studies. Understanding 
the exact mechanism of action of the MBNL 
proteins might help to identify upstream 
elements of this regulatory network. More- 
over, the epigenetic state of ES cells — that is, 
genomic modifications that affect gene expres- 
sion without changing the DNA sequence — 
is subject to continuous regulation, and a link 
between epigenetics and alternative splicing 
has been proposed. Understanding how
alternative splicing interacts with epigenetic
and other networks that are known to regulate
pluripotency would be fascinating. Furthermore,
Han et al. identified many more sites of
alternative splicing, and differential regulators
of splicing in ES cells, that they could not
investigate in the current work. These should
be studied, as they might provide additional
insights into the mechanism by which alternative
splicing controls pluripotency.

The authors’ observations might also have a 
notable practical implication. Splicing regula- 
tors could potentially be harnessed to control 
the efficiency and outcome of cellular differ- 
entiation and reprogramming — akin to the 
use of transcription factors for these purposes. 
While we tune in for follow-up studies, Han 
and colleagues’ findings will surely change 
the ways in which researchers examine and 
manipulate pluripotent cells.

The vector as protector


Malaria infections are not always lethal. One reason for this may be that 
transmission from mosquitoes creates malaria parasites that trigger a more 
protective mammalian immune response. SEE LETTER P.228 


ANDREW F. READ & NICOLE MIDEO 


Malaria parasites can kill people,
but death is not inevitable. Most
infected individuals recover,
some after experiencing relatively mild
symptoms or none at all. What accounts
for this variability? Host factors such as the
expression of sickle-cell genes or acquired
immunity are part of the explanation. But
it is also well known that malaria parasites
themselves can be more or less nasty. In
this issue, Spence et al. (page 228) report a
set of clever experiments in a mouse model
of malaria infection that shows that the
conditions experienced by parasites before
they reach the mammalian bloodstream can
determine just how virulent they are.
Malaria parasites transmitted to people by 
mosquitoes migrate to the liver, where they 
replicate before entering the bloodstream. For 
convenience, and because only blood-stage 
parasites cause disease, most experimental 
studies of malaria in humans and animals 
bypass the mosquito and liver stages and inject 
parasites directly into the bloodstream. Using 
the malaria parasite Plasmodium chabaudi, 
which infects rodents, Spence and colleagues 
compared the blood-stage infections gen- 
erated by this method with those initiated 
naturally, by mosquito bite. They found that,
compared with directly injected parasites,
mosquito-transmitted parasites
replicated less well once in the bloodstream
and generated lower-grade
infections that persisted for longer.
Moreover, these parasites did not
induce the severe weight loss, hypothermia
and liver damage caused by
parasites injected directly into the
bloodstream.

*This article and the paper under discussion were
published online on 29 May 2013.

Why these differences? An impor- 
tant clue came from the authors’ 
finding that, in immunodeficient 
mice, parasites transmitted by mos- 
quitoes grew just as well as those 
injected directly. This suggested that 
there is nothing intrinsically attenu- 
ated about parasites derived from 
mosquitoes. Spence et al. show that 
mosquito-transmitted parasites elicit 
a qualitatively different immune 
response in the mouse — one that 
better controls parasite replication 
and relies less on the inflammatory 
signalling molecules that are asso- 
ciated with severe disease. To try 
to explain this difference, Spence et 
al. conducted a genome-wide RNA 
analysis and found that mosquito 
transmission modifies the expres- 
sion of about 10% of the genome 
of blood-stage parasites. Intrigu- 
ingly, expression was most intensely 
regulated for gene families encoding 
antigenic proteins, against which the 
host’s immune system mounts its response. 
The hypothesis, then, is that mosquito trans- 
mission alters subsequent antigen expression 
when the parasites are in the bloodstream, 
and that the induced gene-expression pattern 
elicits an immune response that more effec- 
tively contains the parasites with less collateral 
damage to the host. 

It seems that it is the environment experienced
by the parasites during natural transmission
that triggers this ‘attenuated phenotype’.
That environment could be inside the mos- 
quito itself, or it could be something experi- 
enced by the parasite in the skin soon after 
injection, during its journey to the liver or in 
the liver. Intriguingly, Spence et al. show that 
the attenuated phenotype also occurs in mice 
injected with blood-stage parasites isolated 
from other mice with mosquito-initiated 
infections. Thus, the phenotype is stable for 
several cycles of blood-stage parasite replica- 
tion, although it does gradually decay over 
subsequent rounds of injecting these parasites 
into new hosts. It will be interesting to deter- 
mine whether profiles of the host immune 
response and of parasite-antigen expres- 
sion associated with attenuation decay in a 
similar manner. 

Figure 1 | Evolutionary selection of antigenic profiles. If variation in
the antigens expressed by a parasite gives rise to qualitatively different
infection dynamics and outcomes, natural selection might favour the
expression of different antigenic profiles at different times or in different
regions. For instance, to survive a prolonged dry season, when little to no
transmission occurs, parasites with the attenuated phenotype described
by Spence et al., which causes chronic infections of low virulence, and
the associated antigenic profile may be most successful. By contrast,
when the rainy season begins and epidemic situations arise, parasites
with antigen-expression profiles that result in rapid proliferation
and transmission may be favoured. In this case, the cost of shorter
infectious periods associated with rapid clearance of the parasite by the
immune system, or host death, may be offset by the advantages of faster
transmission to new hosts. These evolutionary forces might generate
parasites that respond to cues associated with transmission (through
altered gene expression) in some regions, and parasites that do not in
others, such as in endemic areas where transmission occurs year-round.

Does this discovery mean that all future
experimental malaria infections should be
initiated by mosquitoes? There is no way to
include mosquito transmission in in vitro
studies of the most lethal human malaria
parasite, Plasmodium falciparum. Immunosuppressed
mice with human-cell transplants
can support P. falciparum infections, but it is
unclear whether the addition of one aspect 
of biological reality (mosquito transmission) 
will make up for the loss of another (the use of 
human parasites in mice). Mosquito infections 
are an option in animal models, from which 
much has already been learned by injecting 
blood-stage parasites. For example, experi- 
ments with P. chabaudi have shown that a 
powerful contributor to the severity of malaria 
can be the host immune response itself, and
that competition between different parasite
strains can be a potent force shaping the evolution
of drug resistance. The key question
is not whether these phenomena still occur if 
more of the parasite life cycle is incorporated 
into the experimental work, but whether they 
occur in nature. 

The effects of mosquito transmission on 
host immune response and parasite anti- 
gen expression observed by Spence and col- 
leagues might be the independent outcomes 
of environmental influences, or they might 
be causally connected. If the latter is true, the 
question remains whether immunity trig- 
gers the antigenic profile or whether altered 
antigen expression triggers a more protective 
immune response. The direction of this causal- 
ity could have implications for vaccine design. 
In the first scenario, considering
the characteristics of the immune 
response generated by a vaccine 
would be important for protecting 
against severe disease if vaccination 
does not completely block infection. 
In the second case, a vaccine that 
results in exposure to a particular 
antigenic profile may be crucial for 
developing an optimally protective 
response. Spence et al. compared 
infections initiated by mosquitoes 
and by blood-stage parasites at just 
one time in the blood-stage infec- 
tion, but antigen expression can be 
highly variable in time and across 
host tissues, so further assessment
of these profiles is needed. 

If antigen-expression profiles 
are indeed a major determinant of 
malaria-parasite virulence, and if 
these are not completely constrained 
by the parasite’s developmental 
requirements, we predict that natural 
selection will favour different anti- 
gen-expression profiles in different 
epidemiological settings (Fig. 1). If 
this is the case, then virulence vari- 
ability due to genetic polymorphisms 
or phenotypic plasticity will be com- 
mon in nature. This might explain 
apparently contrasting experimen- 
tal results. For example, Spence et al. 
found that mosquito transmission 
attenuated parasite replication in two clones 
of P. chabaudi, but earlier experiments using 
a different clone found no such effect. Similarly,
physicians who deliberately infected
people with P. falciparum to treat neurosyphilis
reported the same clinical picture regardless
of how the infection was initiated. Clearly,
much is yet to be learned about how malaria
parasites make people sick, and about the
role of the mosquito vector in modulating the
disease it initiates.

COSMOLOGY


Hydrogen wisps reveal 
dark energy 


Traces of hydrogen gas, detected over vast regions of space, have for the first time 
been used as a standard ruler to measure dark energy — the unknown cosmic 
energy that is causing the Universe’s expansion to speed up. 


TAMARA M. DAVIS 


Writing in Astronomy & Astrophysics
and the Journal of Cosmology and
Astroparticle Physics, the Baryon
Oscillation Spectroscopic Survey (BOSS) collaboration
reports on how it has used observations
of cosmic hydrogen gas to determine
the expansion rate of the Universe back before
the epoch of acceleration began.

That the expansion of the Universe began 
to accelerate about 7 billion years ago is now 
an accepted part of the standard cosmologi- 
cal model. Yet what is causing the acceleration 
remains a mystery. The term ‘dark energy’ 
encompasses several theoretical possibilities, 
but to distinguish between them more infor- 
mation is needed, such as whether dark energy 
is constant or changes over time. 

So far, astronomers have been limited by 
the fact that most observations have targeted 
the relatively nearby Universe (at redshift (z) 
of less than about 1, where supernovae and 
galaxies are easy to see) or far away (at z about
1,100, where the remnant radiation from the 
Big Bang is seen). The BOSS team bridges this 
redshift gap, and measures cosmic expansion 
at z about 2.3, corresponding to a time when 
the Universe was less than a quarter of its 
present estimated age of 13.8 billion years. 
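
The "less than a quarter" figure can be checked with a short calculation. What follows is only a sketch under stated assumptions: it uses a flat universe with a constant dark energy, ignores radiation, and adopts illustrative values for the Hubble constant and matter density (`H0_KM_S_MPC`, `OMEGA_M`) that are not given in the article.

```python
import math

# Assumed cosmological parameters (illustrative; not taken from the article).
H0_KM_S_MPC = 67.7          # Hubble constant in km/s/Mpc
OMEGA_M = 0.31              # matter density today, as a fraction of critical
OMEGA_L = 1.0 - OMEGA_M     # constant dark energy in a flat universe

# 1/H0 expressed in billions of years: 977.8 Gyr for H0 = 1 km/s/Mpc.
HUBBLE_TIME_GYR = 977.8 / H0_KM_S_MPC


def age_gyr(z):
    """Age of the Universe (Gyr) at redshift z, neglecting radiation."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return HUBBLE_TIME_GYR * 2.0 / (3.0 * math.sqrt(OMEGA_L)) * math.asinh(x)


age_now = age_gyr(0.0)
age_at_boss = age_gyr(2.3)

print(f"age today:          {age_now:.1f} Gyr")
print(f"age at z = 2.3:     {age_at_boss:.1f} Gyr")
print(f"fraction of today:  {age_at_boss / age_now:.2f}")
```

With these assumed parameters the age at z = 2.3 comes out below a quarter of the present age, in line with the statement above; other reasonable parameter choices shift the numbers only slightly.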

This stunning measurement confirms the 
existence of dark energy and, most interest- 
ingly, shows no sign that it has varied over the 
past 10 billion years. Dark energy remains con- 
sistent with Einstein’s cosmological constant — 
a result that could easily have been disproved 
with data from these distances. This measure- 
ment was made possible by the efforts of the 
BOSS team. The collaboration is in the process 
of obtaining the spectra of 1.6 million galax- 
ies and 150,000 quasars (the extremely lumi- 
nous central parts of active galaxies) using the 
Sloan Digital Sky Survey 2.5-metre telescope 
at Apache Point Observatory in Sunspot, New 
Mexico. The purpose is to determine the dis- 
tribution of matter across more than half of 
the observable Universe. The distribution is 
not random, and it holds a wealth of informa- 
tion about dark energy, dark matter and the 
strength of gravity. 

Ever since it was revealed by supernovae 
observations in 1998 that the expansion of
the Universe is accelerating, enormous effort
has been put into measuring this acceleration 
in enough detail to try to elucidate the cause. 
Among the primary observable parameters 
have been baryon acoustic oscillations. These 
oscillations were formed by sound waves in the 
early Universe, which was then so dense that 
sound travelled everywhere at more than half 
the speed of light. About 300,000 years after 
the Big Bang, the Universe had expanded to the 
extent that matter was no longer dense enough 
for sound waves to propagate. The waves then 
froze into place, leaving a characteristic scale 
imprinted on the primordial density distribu- 
tion from which galaxies would eventually 
form. 
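
The statement that sound travelled at more than half the speed of light can be illustrated with the textbook expression for the sound speed in a tightly coupled fluid of photons and baryons. The symbols R, ρ_b and ρ_γ (the baryon and photon energy densities) are introduced here for illustration and do not appear in the article; the limit shown applies when baryons contribute little, and the speed falls below this value as their share grows.

```latex
c_s = \frac{c}{\sqrt{3\,(1+R)}}, \qquad
R \equiv \frac{3\rho_b}{4\rho_\gamma}, \qquad
c_s \;\to\; \frac{c}{\sqrt{3}} \approx 0.58\,c \quad \text{as } R \to 0 .
```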

Over the past decade, galaxy surveys such as 
the two-degree Field Galaxy Redshift Survey,
the six-degree Field Galaxy Survey, the Sloan
Digital Sky Survey and the WiggleZ Dark
Energy Survey have mapped the distribution
of galaxies in the Universe at ever increasing 
distances. They have all revealed the charac- 
teristic baryon-acoustic-oscillation scale in 
the distribution pattern of galaxies. Using that 
scale as a standard ruler, the acceleration of the 
expansion has been beautifully confirmed with 
a precision that now equals that obtained by 
supernovae studies. 

BOSS is the next survey in that distinguished 
line, but the team has seen even farther by
observing quasars. The extreme brightness of 
quasars arises because of the heating of mate- 
rial actively falling into the central black holes 
of the galaxies that host them. The team used 
these quasars as ‘backlights’ to detect the wispy 
hydrogen gas that permeates the Universe 
between the galaxies. When sunlight passes 
through the branches of a tree, the pattern of
leaves can be inferred by the shadows that they 
cast (Fig. 1). Similarly, when quasar light passes 
through hydrogen clouds, the absorbed light 
gives a map of where the hydrogen lies. 

This is not the first time that hydrogen has 
been traced using quasars as backlights. Nor 
is it the first time that baryon acoustic oscil- 
lations have been measured. However, it is 
the first time that hydrogen has been used to 
measure baryon acoustic oscillations, and it 
is by far the most distant measurement of the 
Universe’s expansion rate so far. 

This spectacular result supports the idea that 
the simplest model of dark energy — that it 
is constant — really is the best one. It leaves 
astronomers in an interesting position. Just 
as particle physicists found the Higgs boson 
exactly where they expected to find it, cosmol- 
ogists have found dark energy exactly where 
the simplest theory predicts it to be. 

Where do we go from here? The problem 
remains that there is no good theoretical expla- 
nation for either dark energy or dark matter. 
Observations will continue to improve, but it 
is becoming clear that the real breakthroughs 
needed are theoretical. In a sense, observations 
are easy. Given more time, better equipment, 
careful analysis and increased person-power, 
measurements can always be improved. 
Theory is much more difficult. No amount of 
time will guarantee a breakthrough, and it is 
possible that the ‘next big thing’ could become 
lost in the morass of poor theories for lack of a
sufficiently charismatic proponent. 

In the next generation of cosmology
experiments, observers, aware of this issue,
are doing their best to inform theory in different
ways. They are providing different types of
observations that could assist in distinguishing
between theories and in directing theorists’
investigations.

Figure 1 | Useful shadows. From the pattern of shadows on the ground, one can infer the pattern of
the leaves of a tree. The BOSS team has used the ‘shadows’ in spectra of distant quasars to infer the
distribution of hydrogen gas in the Universe, and to detect the effects of dark energy.

Meanwhile, I continue to be awed by the 
fact that humans are able to measure the dis- 
tribution of hydrogen as it was more than 
6 billion years before Earth formed, and to 
relate it to sound waves in the infant Universe
by applying only simple physical concepts,
such as pressure and gravity, which also govern
everyday life on Earth. That fact further
increases my confidence in the overall picture
that cosmology has revealed, and is inspiration
enough to continue efforts to figure out
the remaining mysteries.

CONSERVATION

Spare our restored soil


The conversion of poor-quality arable lands to grassland has prevented soil 
erosion and sequestered carbon. A study finds that greenhouse gases will be 
emitted if these lands return to cultivation, especially if they are ploughed. 


JOHAN SIX 


In 1985, the US Department of Agriculture
established the Conservation Reserve
Program to revitalize degraded and marginalized
agricultural land by converting it to
grassland. Although enthusiasm for the programme
has been far from universal, many
agree that it did generally rebuild soils and
cause carbon to be sequestered within them.
But with a large number of the programme's
contracts with farmers about to expire, revitalized
land might soon be ploughed up, with
unknown environmental consequences. Writing
in Global Change Biology, Ruan
and Robertson fill us in on one of
those consequences: the effect on
greenhouse-gas emissions. They
conclude that it would be best to
maintain these restored grasslands
as they are, but that, if they must be
cultivated, ‘no-tillage’ farming produces
many fewer greenhouse-gas
emissions than does farming involving
conventional ploughing.

Ploughing up land results in
the depletion of soil carbon: on
average, about 50% of the carbon
is lost, compared with the amount
maintained under naturally occurring
vegetation. Emissions of the
greenhouse gas nitrous oxide are
also drastically increased because of
fertilizer use and increased nitrogen
mineralization in soil when land is
cultivated. The global-warming
impact is thus drastically increased
when land is ploughed up for cultivation
(Fig. 1a). Another profound
effect of ploughing and cultivating
land is that it decreases the
structural stability of soil, leading
to erosion. This became a major concern in
the 1970s.

The Conservation Reserve Program (CRP) 
was put in place to ‘retire’ vulnerable lands by 
restoring them to grassland, with the goal of 
preserving soils and all the services they can 
provide, such as storing water, carbon and 
nutrients. At the same time, no-tillage prac- 
tices — made possible by the availability of 
herbicides — were promoted to conserve 
arable soils. More recently, no-tillage cultiva- 
tion has been advocated as a potential tool for 
mitigating or adapting to climate change. 

Figure 1 | Land-use change influences greenhouse-gas fluxes from soils
to the atmosphere. a, Ploughing grasslands to create arable land increases
greenhouse-gas emissions (red line) compared with those produced by
the natural ecosystem (grey broken line), especially in the period shortly
after ploughing, until a new equilibrium is reached. b, Emissions can be
reduced, especially in the long term, by restoring arable land to grassland
(yellow line), or, less effectively, by using no-tillage practices (green line) on
the arable land. c, Ruan and Robertson report that, if restored grassland is
recultivated using conventional tillage, greenhouse-gas emissions increase
in the year after ploughing; the effect is smaller if no-tillage practices are
used. The amount of emissions in the longer term remains unknown.
(Axes: global-warming impact against time; panel labels: ploughing, conservation practice.)

The effects of no-tillage practices on
soil-carbon sequestration and global warming,
compared with those of conventional
tillage, have been studied and debated extensively.
Three points are generally accepted.
First, when conventional tillage is replaced by
no-tillage cultivation, soil carbon increases
in the soil surface layers, but few significant
differences in soil carbon are found in deeper
layers. Second, nitrous oxide emissions might
increase in recently converted systems, but
are lower in the long term. And third, emissions
of the greenhouse gas methane remain
essentially the same.

Similarly, converting conventionally tilled 
systems to grasslands (as in the CRP) generally
leads to increases in soil carbon. It is also
expected to decrease nitrous oxide emissions
and increase the uptake of methane into the
soil, leading to a general decrease in global-warming
impact (Fig. 1b). But no study has
investigated the effect of converting restored
CRP grassland to arable land on greenhouse-gas
emissions and global warming, until now.

Ruan and Robertson measured the main 
soil-derived greenhouse gases (carbon diox- 
ide, nitrous oxide and methane) in 
four fields that had been managed 
under the CRP. Three of those fields 
were converted to soya-bean culti- 
vation and divided into no-tillage 
and conventional-tillage plots; 
the fourth field was maintained as 
grassland. The authors also took 
ancillary measurements, such as 
soil temperature, moisture, density 
and mineral nitrogen content, all of 
which could potentially explain any 
observed differences in greenhouse- 
gas emissions between the fields. 

A limitation of the study is that 
the researchers took measure- 
ments for only one year. But such 
short-term assessments are cru- 
cial, because the greatest changes 
in greenhouse-gas emissions are 
expected shortly after grassland is 
ploughed up, and in the first year of 
cultivation as the system responds 
to the change — often undergoing 
extreme, but possibly only tran- 
sitory, changes in soil structure, 
soil nutrients and plant growth.

Ruan and Robertson observed
that CRP-managed grasslands are a net
greenhouse-gas sink, whereas soya-bean
agro-ecosystems are a net source of greenhouse
gases. The authors’ findings confirm that we 
will lose some of the environmental services 
provided by CRP land if we cultivate it. How- 
ever, they also report that soya-bean systems 
under no-tillage management have less than 
half of the global-warming impact of those that 
are conventionally tilled (Fig. 1c). 
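
To make the idea of a combined global-warming impact concrete, the three gases can be put on a common scale by weighting each flux by its 100-year global-warming potential. The sketch below is illustrative only: the weighting factors are the IPCC Fifth Assessment values (which may differ from those used in the study), and every flux value, along with the names `GWP_100` and `co2_equivalent`, is hypothetical and chosen simply to show the arithmetic.

```python
# 100-year global-warming potentials relative to CO2 (IPCC AR5, without
# climate-carbon feedbacks). An assumption; not taken from the study.
GWP_100 = {"co2": 1.0, "ch4": 28.0, "n2o": 265.0}


def co2_equivalent(fluxes):
    """Convert per-gas fluxes (kg of gas per hectare per year) to kg CO2-eq."""
    return sum(GWP_100[gas] * flux for gas, flux in fluxes.items())


# Hypothetical fluxes for illustration only; negative values mean uptake.
grassland = {"co2": -1000.0, "ch4": -1.0, "n2o": 0.5}
no_till = {"co2": 300.0, "ch4": -0.5, "n2o": 3.0}
conventional = {"co2": 500.0, "ch4": -0.5, "n2o": 8.0}

for name, fluxes in [("grassland", grassland),
                     ("no-tillage", no_till),
                     ("conventional tillage", conventional)]:
    print(f"{name:22s} {co2_equivalent(fluxes):9.1f} kg CO2-eq per ha per yr")
```

In this made-up example the grassland is a net sink, the no-tillage plot has less than half the impact of the conventionally tilled one, and nitrous oxide dominates the difference because of its large weighting factor.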

As is usual for agricultural systems, nitrous 
oxide emissions accounted for most of the 
differences in global-warming impact, empha- 
sizing that this is the greenhouse gas we need 
to monitor and manage in such systems. This 
finding also raises the question: what would
have happened if the CRP grassland had been 
converted to fertilized maize (corn) instead 
of unfertilized soya bean? The nitrous oxide 
emissions might have been even greater. But, if 
the CRP grassland had contained leguminous 
vegetation, as some CRP grasslands do, the 
differences between the cultivated and non- 
cultivated systems might have been smaller. 
However, the most pertinent remaining ques- 
tion is, what long-term effect will cultivation 
of CRP land have on greenhouse-gas emis- 
sions? Are the differences observed by Ruan 
and Robertson transitory, or will they become 
even bigger over time? 


Unresolved issues aside, the authors’ study
does have one clear message: let us not lose the
environmental services that have been provided
by the CRP. No-tillage practices should
be considered to attenuate the greenhouse-gas
costs of ploughing up CRP land, but the best
option for the environment is to maintain the
land under grasses. Incentives for both options
need to be provided.

EARTH SCIENCE

Water may be a damp squib


Experiments on silicon diffusion in the mineral olivine cast doubt on the widely 
held belief that water has a significant effect on the rheological properties of
Earth’s upper mantle. SEE LETTER P.213


JOHN BRODHOLT 


In 1965, Griggs and Blacic demonstrated
that wet quartz is much weaker than dry
quartz, and suggested that this should
also be the case for other silicates. Indeed, the
authors concluded: “These observations raise
the possibility of great weakness in the earth's
deeper crust and outer mantle at temperatures
far below the melting point.” In other words,
water might control the viscosity of all rocks.
Since then, experiments on other silicates have
reinforced that view, and small amounts of
water bound into normally dry minerals such
as olivine are now attributed with almost divine
powers in shaping the way that Earth works.
On page 213 of this issue, Fei and colleagues
show that water may be much less important in
controlling many large-scale processes occurring
in the Earth than was previously thought.

The ability of small amounts of water (or,
more strictly, the concentration of hydroxide,
OH⁻) to weaken minerals and rocks is known
as hydrolytic weakening, and is implicated in
a wide range of Earth and planetary processes.
Indeed, it is argued that plate tectonics itself
may owe its existence to hydrolytic weakening.
The tectonic difference between Venus
and Earth may be because Earth has kept
more of its water. Moreover, Earth could be
subducting more water into its mantle than
it loses from volcanoes at mid-ocean ridges
and hotspots, thereby further weakening the
mantle and enhancing mantle convection
rates, despite the overall background cooling.
The effect of water on the viscous strength of 
rocks is conveniently called upon when nor- 
mal rock-deformation processes cannot be 
invoked as an explanation, and phrases such as
‘water weakens rocks by orders of magnitude’
are commonplace in Earth science. Fei and
colleagues’ results throw a spanner in the
works by suggesting that hydrolytic weakening
in olivine is a much smaller effect than
generally thought.

50 Years Ago

‘The scientific revolution and leisure’
Many of our obligations
eat into our spare time and, in a
large city, the amount of spare time
which is actually free may be very
small. First of all there is the journey
to and from work. Moreover, when
the worker arrives home a queue
of domestic duties may await him:
shopping and the payment of bills,
repairs in the home, the care of
children, and perhaps a visit to the
doctor or dentist ... preparations
for a holiday may involve a gigantic
effort ... All the same, it remains
true that the free time enjoyed by
the average worker has enormously
increased during the past century.
A hundred years ago the average
expectation of life at birth was 40
years and a man worked about 70
hours a week. To-day, these figures
are reversed, the expectation of life
is 70 years and the working week is
nearer 40 hours. Although there is,
relatively speaking, plenty of free
time, happiness, the Holy Grail of
the twentieth century, remains as
remote as ever.

From Nature 15 June 1963

100 Years Ago

It has been shown experimentally
that fever is due to the digestion
of proteins in the blood and in the
tissues. Bacteria are living proteins.
They get into the body and grow,
converting the proteins of man’s
body into bacterial proteins. After
a period of incubation the cells
of the body pour out a ferment
which digests and destroys the
bacteria. In this process fever
originates. In itself fever is
beneficial; it is a manifestation of
the attempt on the part of nature
to destroy the invading organism.
However, nature may overdo the
matter, and fever per se becomes
dangerous when it goes much
above 105°.

From Nature 12 June 1913

It needs to be said upfront that Fei et al. did 
not directly measure the viscosity of wet ver- 
sus dry rocks. They approached the problem 
from a different angle by measuring silicon dif- 
fusion. This is because the viscosity of high- 
temperature rocks is often controlled by the 
most slowly diffusing atom, which is silicon in 
the case of olivine. In dry olivine, for instance, 
the activation energies for creep and for silicon 
diffusion are almost the same, suggesting that 
creep in olivine is controlled by silicon diffusion.
Fei and colleagues found that, surprisingly,
water increases silicon diffusion by less
than a factor of ten over three or more orders 
of magnitude in water content. If the viscos- 
ity of olivine is controlled by silicon diffusion, 
then this effect is much less than required for 
the several orders of magnitude of weakening 
commonly cited in the literature. 
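
The size of the mismatch can be seen with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that diffusion (and hence weakening) scales as a power law in water content; the function name `power_law_exponent` and the choice of a thousandfold weakening as the comparison target are hypothetical, and only the factor of ten over three orders of magnitude comes from the text.

```python
import math


def power_law_exponent(effect_ratio, water_ratio):
    """Exponent r such that effect_ratio == water_ratio ** r."""
    return math.log10(effect_ratio) / math.log10(water_ratio)


# From the text: diffusion rises by less than a factor of 10 while the water
# content spans three orders of magnitude, so r is at most about one-third.
r_upper = power_law_exponent(10.0, 1000.0)

# Weakening over that same range of water content, if viscosity simply
# tracks silicon diffusion (the assumption discussed in the paragraph above).
weakening_over_range = 1000.0 ** r_upper

# Illustrative comparison: the span of water content that would be needed to
# reach a thousandfold weakening with such a small exponent.
water_span_needed = 1000.0 ** (1.0 / r_upper)

print(f"upper bound on exponent r:        {r_upper:.2f}")
print(f"weakening over 1,000x more water: {weakening_over_range:.0f}x")
print(f"water span for 1,000x weakening:  {water_span_needed:.0e}x")
```

Under this simple scaling, three orders of magnitude of weakening would require roughly nine orders of magnitude in water content, which is why the diffusion result sits so uneasily with the commonly quoted figures.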

So which theory is right? First of all, it could 
simply be that the deformation mechanism in 
olivine is not controlled by silicon diffusion 
at all, and that the similar activation energy 
of creep and diffusion is just a coincidence. 
Another possibility is that the deformation 
experiments showing a strong hydrolytic weakening
effect were performed under water-saturated
conditions, possibly enhancing other
deformation mechanisms (such as sliding on 
crystalline-grain boundaries) and thereby 
producing an artificially weakened rheology. 
However, the strain rates in the deformation 
experiments should then show a dependence 
on grain size — something that the authors 
of the studies took pains to point out is not 
observed. And finally, Fei and colleagues’ diffu- 
sion experiments were performed on iron-free 
olivines; ferric iron and other ionic species may 
affect both diffusion and deformation. 

However, support for a small hydrolytic- 
weakening effect was published earlier this 
year’. These authors used a newer deforma- 
tion apparatus to measure the rheology of wet 
olivine up to a pressure of 7 gigapascals (equiv- 
alent to a depth of about 200 kilometres). First, 
they found that wet olivines were only about 
1.5 times weaker than dry olivines, and second, 
they saw no measurable dependence of viscos- 
ity on water after the first few parts per million 
or so of water. In other words, a small amount 
of weakening occurred from a small concentra- 
tion of water, after which the strength remained 
the same regardless of the water content. Cer- 
tainly, the authors did not see the large depend- 
ence of viscosity on water content seen in the 
earlier deformation experiments’. 

So is it possible that the large effect of 
water was never really there in the deformation
experiments to begin with? Wet olivine
is certainly weaker than dry olivine, but the 
deformation experiments on which marked 
hydrolytic weakening is based were performed
at relatively low pressures (less than 0.5 GPa). 
At these pressures, the solubility of water in oli- 
vine is low, restricting the range of water con- 
tents in individual studies and perhaps making 
it difficult to determine the exact dependence 
of viscosity on water concentration. However, 
it is worth noting that the amount of water in 
the latest deformation experiments’ is also 
restricted, and so the uncertainty in deter- 
mining the dependence of viscosity on water 
content could be aimed at their — opposite — 
results too. 

It is early days. Fei and colleagues’ diffusion 
experiments need to be repeated and extended 
to other compositions — particularly iron- 
bearing olivines. Deformation experiments 
need to be performed on olivines containing 
more water, and on polycrystalline material as 
well as single crystals. Deformation and dif- 
fusion mechanisms appropriate to Earth con- 
ditions and compositions need to be worked 
out. And, of course, Earth is not only olivine, 
so what about the other minerals of which it 
consists, such as pyroxenes, garnet, wadsleyite, 
ringwoodite and perovskite? How does their 
strength depend on water? 
Ion channel


Finally, what about all those Earth processes 
that seem to require hydrolytic weakening? It 
is worth pointing out that there are other ways 
of softening minerals and rocks. Melts, strain 
localization, grain-size reduction and changes 
in deformation mechanism can all produce 
interesting dynamical behaviour’. So, regard- 
less of whether hydrolytic weakening is or is 
not a strong effect, plate tectonics exists, and 
Venus is definitely different from Earth.
twists to open


GIRK channels allow potassium ions to cross the cell membrane, thereby 
affecting the electrical status of the cell and so its functioning. Structural data 
now provide insight into the channels’ mode of operation. SEE ARTICLE P.190 
for the electrical activity in our body. They

constitute a large family of some 400 proteins 
in humans. A subfamily of these proteins con- 
sists of four GIRK channels’, which specialize 
in converting chemical signals — mostly those 
of neurotransmitter molecules such as acetyl- 
choline, dopamine, serotonin and adrenaline — 
into electrical ones in heart cells and neurons. 
They are therefore essential for controlling heart 
rate and the activity of neural circuits. In this 
issue, Whorton and MacKinnon’ (page 190) 
describe the long-awaited crystal structure of 
the mammalian GIRK2 channel in complex 
with two subunits of a G protein (a dimer of the 
Gβ and Gγ subunits), providing information
about their mechanism of opening*. 

Activation of GIRK channels often begins 
with stimulation of G-protein-coupled 


Ion channels are the main units responsible


*This article and the paper under discussion were
published online on 5 June 2013.
receptors (GPCRs) in the cell membrane (see
Fig. 1a of the paper). For instance, binding
of acetylcholine to a muscarinic-type GPCR
on a heart cell results in release of the Gα and
Gβγ subunits of the G protein that is attached
to the GPCR at the intracellular surface of
the cell membrane. Gβγ then activates GIRK
channels, which allow efflux of intracellular
potassium ions (K⁺) from the cell, causing
hyperpolarization of the cell membrane
(it becomes more negative inside relative 
to outside) and so reducing the cell’s elec- 
trical excitability. Acetylcholine thus slows 
the heart rate. 

Following the breakthrough discovery
that the Gβγ dimer is responsible for opening
GIRK channels after GPCR activation,
extensive biochemical and electrophysiologi- 
cal studies have focused on the mechanism 
of activation of these channels and the role of 
associated modulatory molecules’. These stud- 
ies, however, fell short of deciphering the exact 
mode of interaction of the Gβγ subunits with
Figure 1 | The GIRK2 channel in action. Binding of the Gβ and Gγ subunits of a G protein to each of
the four GIRK2 monomers activates the homotetrameric GIRK2 channel. a, Side view of the channel,
with the associated PIP2 molecules and Gβγ. b, Bottom view of the same complex. Note the four-fold
symmetry and the centre permeation pathway for potassium ions (K⁺). c, Models of structural
rearrangements associated with the opening of the GIRK2 channel by Gβγ dimers. Looking from
inside the cell, when Gβγ binds the cytoplasmic-associated domain rotates clockwise relative to the
membrane-associated domain to widen the permeation pathway at the inner gate. The resulting pre-open
conformation, however, is not large enough to allow passage of hydrated K⁺ through the channel.
Additional twisting in the same direction can widen the inner gate further, allowing hydrated K⁺ to move
through this open conformation. Movement from the pre-open to the open conformation is random, and
is thought to be the mechanism governing the well-known bursting activity of the channel.


the channel and the structural transitions that 
lead to channel opening. 

Whorton and MacKinnon describe the atomic-level
interaction of the Gβγ dimer with a GIRK
channel consisting of four GIRK2 monomers
(Fig. 1a,b). The 3.5-ångström-resolution structure
shows that each of the four monomers
is bound to a Gβγ dimer, in agreement with
previous biochemical evidence. They are also
individually bound to a molecule of the phospholipid
PIP2 and a sodium ion, both of which
are necessary for channel functionality.

The Gβγ dimer and the channel share a
relatively small surface area of contact (roughly
700 Å²), compared with the footprint of other
known Gβγ interactor molecules such as Gα,
the GPCR kinase-2, phospholipase-Cβ and
phosducin. Nevertheless, the contact areas of
each interactor with Gβγ overlap to various
degrees, such that Gβγ cannot bind to more
than one interactor simultaneously; this
underlines the singularity of the Gβγ-mediated
signalling event. The Gβγ dimer seems to
interact with the channel at the interface of the
channel monomers, including regions that are
known to be involved in channel activation,
such as the LM loop. The interaction involves
both short-range intermolecular forces such
as van der Waals forces and hydrogen bonding
and long-range electrostatic forces, and is
further stabilized by anchoring of Gβγ to the
cell membrane through the lipid moiety in the
Gγ subunit.

To reveal the structural changes associated 
with channel activation, Whorton and 
MacKinnon aligned three structures of 
GIRK2–PIP2: the normal channel, the
channel in complex with Gβγ dimers and an
always-active channel mutant. A comparison
of the first two structures revealed two main
conformational differences. On Gβγ-dimer
binding, there was a clockwise (looking from 
inside the cell) rigid-body rotation of about 4° 
along the centre axis of the channel, relative 
to the transmembrane domains. There was 
also a widening of the bottom of the channel's 
inner helices — the inner helical gate — on the 
cytoplasmic side. 

The conformational changes that open 
the inner helical gate are comparable to the 
widening of a lens aperture by hand-rotating 
the aperture ring. In the resulting conforma- 
tion, however, the gate is too narrow to allow 
hydrated K⁺ to pass through the channel. So
how do Gβγ dimers ‘gate’ the channel? Adding
the always-active channel to the analysis pro- 
vided an answer. In this structure, the rotation 
of the cytoplasmic domain relative to the trans- 
membrane domain was more pronounced, 
causing the inner gate to widen further and 
permit the passage of hydrated K⁺.

On the basis of their observations, the 
authors formulate a Gβγ-dependent gating
scheme for GIRK channels (Fig. 1c). Following
activation of GPCRs and dissociation of
the G protein into Gβγ and Gα, the free Gβγ
dimer diffuses to the inner membrane surface
and binds to the channel molecule to induce
a ‘pre-open’ state. In this state, rotation of the
channel’s cytoplasmic domain relative to its
transmembrane domain broadens the inner
helical gate, although the channel cannot conduct
K⁺. Nevertheless, the pre-open conformation
brings the channel to a higher energetic
state, allowing it to make frequent random
changes to the open conformation, by which
K⁺ conduction can occur. Such frequent conformational
changes are consistent with the
well-characterized ‘bursting’ behaviour of 
the channel that is seen during recordings 
of single-channel activity. 

Whorton and MacKinnon’s data relate 
to homomeric GIRK2 channels, which are 
present only in selected areas of the brain. And
at least one other GIRK channel, GIRK1, differs
from GIRK2 in the length of its amino-acid
sequence and in several amino-acid residues
involved in Gβγ binding. So it is important to
determine the structure of the two most pre- 
valent channel species in the brain and heart 
— the GIRK1/GIRK2 and GIRK1/GIRK4 
heteromers, respectively — in complex with 
the Gβγ dimer. Although the overall structural
transitions associated with gating are likely to
be the same, the contact surface of heteromeric
GIRKs with the Gβγ dimer and the interaction
forces involved could be different. Knowledge 
of such differences may clarify the differ- 
ent gating behaviours previously seen with 
channels of varying composition. 

Although the intimate interaction of the 
GIRK channels with Gβγ dimers forms the
basis of the channels’ gating activity, direct
channel interactions with Gα also fine-tune
the gating mechanism. How Gα provides such
control is unknown. 

More broadly, it may be possible to design 
specific molecules that could interfere with 
channel function by targeting the unique inter- 
action interface of its GIRK2 monomers with 
Gβγ. Such drugs would be desirable because
they would not affect other Gβγ-dependent
signalling events.


Eitan Reuveny is in the Department of 
Biological Chemistry, Weizmann Institute of 
Science, Rehovot 76100, Israel. 

e-mail: e.reuveny@weizmann.ac.il 


1. Hibino, H. et al. Physiol. Rev. 90, 291–366 (2010).
2. Whorton, M. R. & MacKinnon, R. Nature 498, 190–197 (2013).
3. Logothetis, D. E., Kurachi, Y., Galper, J., Neer, E. J. & Clapham, D. E. Nature 325, 321–326 (1987).
4. Wickman, K. D. et al. Nature 368, 255–257 (1994).
5. Reuveny, E. et al. Nature 370, 143–146 (1994).
6. Lüscher, C. & Slesinger, P. A. Nature Rev. Neurosci. 11, 301–315 (2010).
7. Rubinstein, M. et al. J. Physiol. (Lond.) 587, 3473–3491 (2009).


13 JUNE 2013 | VOL 498 | NATURE | 183 


ARTICLE 


doi:10.1038/nature12295 


Locomotion dynamics of hunting in wild cheetahs


A. M. Wilson1, J. C. Lowe1, K. Roskilly1, P. E. Hudson1†, K. A. Golabek2† & J. W. McNutt2


Although the cheetah is recognised as the fastest land animal, little is known about other aspects of its notable 
athleticism, particularly when hunting in the wild. Here we describe and use a new tracking collar of our own 
design, containing a combination of Global Positioning System (GPS) and inertial measurement units, to capture the 
locomotor dynamics and outcome of 367 predominantly hunting runs of five wild cheetahs in Botswana. A remarkable 
top speed of 25.9 m s⁻¹ (58 m.p.h. or 93 km h⁻¹) was recorded, but most cheetah hunts involved only moderate speeds.
We recorded some of the highest measured values for lateral and forward acceleration, deceleration and body- 
mass-specific power for any terrestrial mammal. To our knowledge, this is the first detailed locomotor information 
on the hunting dynamics of a large cursorial predator in its natural habitat. 


Measurements of instantaneous speed, acceleration and manoeuvring 
during athletic competition or hunting are rare, even for humans,
horses and dogs, the most studied species. The cheetah (Acinonyx
jubatus) is acknowledged as the ultimate cursorial predator, and its
published top speed of 29 m s⁻¹ is considerably faster than racing
speeds for greyhounds (18 m s⁻¹), horses (19 m s⁻¹) or humans
(12 m s⁻¹; see ‘Analysis of Bolt’s 100m’ at
http://berlin.iaaf.org/records/biomechanics/index.html). Quantitative measurements of
cheetah locomotion mechanics have only been made on captive animals 
chasing a lure in a straight line, with few studies eliciting speeds faster 
than racing greyhounds. For wild cheetahs, estimates of speed and
track have been made from direct observation or film only, and are
limited to open habitat and daylight hours.


Tracking collar design 


To collect free-ranging locomotion data on wild cheetahs during hunt- 
ing in their normal environment, we designed and built a tracking 
collar similar in size and weight to a conventional wildlife collar
(Fig. 1a; mass of 340 g), equipped with a GPS module capable of deli- 
vering processed position and velocity data, and raw pseudo-range, 
phase and Doppler data for individual satellite signals at 5 Hz, and an 
inertial measurement unit (IMU) consisting of triaxial microelectro- 
mechanical systems (MEMS) accelerometers, gyroscopes and magnet- 
ometers (Methods). The collar was powered by a rechargeable battery 
charged from solar cells, plus a non-rechargeable auxiliary battery. 
Data download and configuration upload was via radio. Collar soft- 
ware monitored the accelerometers to create activity summaries and 
detect the brief hunting events, buffered accelerometer data to capture 
the start of hunts, and adapted collar operation to battery voltages, time 
of day and activity. We increased the effective sample rate of the posi- 
tioning system to 300 Hz, and reduced noise in the kinematic para- 
meters, by fusing data from GPS and the IMU with a loosely coupled 
extended Kalman smoother (Methods). This was especially important 
during hunting because GPS accuracy was degraded both during initialization,
and under conditions of high acceleration and high jerk.
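The idea of fusing low-rate GPS with high-rate inertial data can be sketched with a much simpler filter than the one described above. The example below is a one-dimensional, forward-only linear Kalman filter, not the loosely coupled extended Kalman smoother used for the collar: accelerometer samples at 300 Hz drive the prediction, and a GPS position and velocity fix every 60th sample (5 Hz) corrects it. All function names, noise variances and the synthetic data are assumptions chosen for illustration.

```python
def predict(state, cov, accel, dt, accel_var):
    """Propagate [position, velocity] one IMU step using measured acceleration."""
    pos, vel = state
    new_state = [pos + vel * dt + 0.5 * accel * dt * dt, vel + accel * dt]
    # Covariance propagation F P F^T with F = [[1, dt], [0, 1]].
    p00 = cov[0][0] + dt * (cov[0][1] + cov[1][0]) + dt * dt * cov[1][1]
    p01 = cov[0][1] + dt * cov[1][1]
    p10 = cov[1][0] + dt * cov[1][1]
    p11 = cov[1][1]
    # Process noise from accelerometer error: Q = accel_var * G G^T.
    g0, g1 = 0.5 * dt * dt, dt
    new_cov = [[p00 + accel_var * g0 * g0, p01 + accel_var * g0 * g1],
               [p10 + accel_var * g1 * g0, p11 + accel_var * g1 * g1]]
    return new_state, new_cov


def update(state, cov, measured, index, meas_var):
    """Correct the state with one scalar GPS measurement of state[index]."""
    innovation = measured - state[index]
    s = cov[index][index] + meas_var
    gain = [cov[0][index] / s, cov[1][index] / s]
    new_state = [state[0] + gain[0] * innovation,
                 state[1] + gain[1] * innovation]
    new_cov = [[cov[a][b] - gain[a] * cov[index][b] for b in range(2)]
               for a in range(2)]
    return new_state, new_cov


def fuse(accels, gps_fixes, imu_hz=300, gps_hz=5,
         accel_var=1.0, pos_var=1.0, vel_var=0.01):
    """Return one (position, velocity) estimate per IMU sample."""
    dt = 1.0 / imu_hz
    step = imu_hz // gps_hz  # IMU samples between GPS fixes
    state, cov = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
    track = []
    for n, accel in enumerate(accels, start=1):
        state, cov = predict(state, cov, accel, dt, accel_var)
        if n % step == 0 and n // step <= len(gps_fixes):
            gps_pos, gps_vel = gps_fixes[n // step - 1]
            state, cov = update(state, cov, gps_pos, 0, pos_var)
            state, cov = update(state, cov, gps_vel, 1, vel_var)
        track.append((state[0], state[1]))
    return track


# Synthetic 2-s run at a constant 3 m/s^2, with a hypothetical +0.5 m/s^2
# accelerometer bias and error-free GPS fixes every 0.2 s.
true_accel = 3.0
gps = [(0.5 * true_accel * t * t, true_accel * t)
       for t in (k / 5 for k in range(1, 11))]
biased_track = fuse([true_accel + 0.5] * 600, gps)
print(len(biased_track), round(biased_track[-1][1], 2))
```

Integrating the biased accelerometer alone would overestimate the final velocity by 1 m/s in this synthetic case; the periodic GPS corrections keep the 300 Hz estimate much closer to the true 6 m/s.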


Collection of hunting data 

We recorded GPS-IMU data from 367 runs by three female and two 
male adult cheetahs (100, 66 and 61 runs, and 84 and 56 runs, respectively) over
17 months. A further 530 runs were identified only in the activity data,
because the collar did not trigger on every run owing to the time of
day and conservative trigger thresholds. An episode of feeding after a
run indicated hunting success, and was identified in the activity data 


Figure 1 | Cheetah with collar and anatomical features contributing to 
performance. a, Cheetah with a mark 2 collar is shown. b, Gravitational and 
centripetal accelerations acting on a turning cheetah; g denotes acceleration due 
to gravity, v² r⁻¹ denotes centripetal acceleration, and a is the resultant
acceleration (effective gravity). c, Non-retractable cheetah claws that enhance 
grip. d, Low posture used in deceleration, which prevents pitching and engages 
hind limb musculature to absorb kinetic energy. 


1Structure & Motion Laboratory, The Royal Veterinary College, University of London, Hatfield AL9 7TA, UK. 2Botswana Predator Conservation Trust, Private Bag 13, Maun, Botswana. †Present addresses:
Department of Sport and Exercise Sciences, University of Chichester, College Lane, Chichester, West Sussex PO19 6PE, UK (P.E.H.); Botswana Predator Conservation Trust, Private Bag 13, Maun, Botswana,
and Wildlife Conservation Research Unit, Department of Zoology, University of Oxford, Oxford OX13 5QL, UK (K.A.G.).
by consistent, low-magnitude acceleration on all three axes and was
confirmed on a subset of hunts with field observations (Methods).
Run routes were overlaid on Google Earth to identify terrain. The total 
number of GPS fixes recorded depended on activity, with an average 
of 180 ± 171 (mean ± s.d.) per cheetah per day, and a range of 7 to
1,571. 

Runs started with a period of acceleration, either from stationary or 
slow movement (presumably stalking) up to high speed (Fig. 2). The 
cheetahs then decelerated and manoeuvred before prey capture. 
About one-third of runs involved more than one period of sustained 
acceleration (all 367 runs are presented in Supplementary Video 2). In
successful hunts, there was often a burst of accelerometer data after the 
speed returned to zero, interpreted as the cheetah subduing the prey. 

As well as hunting runs, cheetahs play and run from larger predators, 
but we had insufficient data validated by direct observations to provide 
secure separation of these activities, although only a few runs did not 
involve the tight turns and rapid speed changes characteristic of hunt- 
ing (for example, runs 5, 32 and 49 in Supplementary Video 2). We 
therefore compared successful hunts to all other runs recorded by 
the collar. In total, 94 of the 367 runs (26%) were successful hunts. 
Figure 2 | An example day and hunt. a, Track of cheetah over 11 h (GPS data
are available as a Google Earth file in Supplementary File 1). Each circular mark 
represents a GPS-derived position. Cheetah track and marks are colour-coded to 
collar state (detailed in Supplementary Fig. 1) as follows: alert, blue; mooch, green; 
ready, yellow; chase, red. b, Hunt track magnified from bottom right of a, hunt 
track is anticlockwise and marked with an arrow. Warmer (bright red) colours 
on track represent higher speed. c, Activity summary calculated in the collar from 
the accelerometer (Methods) for the 11-h period shown in a; shaded regions of 
the graph represent collar states as labelled. Line colours: peak accelerometer 
signal amplitude recorded in each 30-s period X, blue; Y, green; Z, red; mean of 
peak amplitude values extracted for each 2-s in each 30-s (that is, 15 bins) period 
X, cyan; Y, magenta; Z, black. The relative values for each axis differentiate 
Including the 530 additional runs detected solely from IMU data did
not change the success rate (223 out of 897; 25% success), which is
lower than previously reported for individual cheetah, perhaps
due, in part, to the inclusion of non-hunting runs. Cheetah are reported
to move in predominantly open habitats using vegetation-edge to stalk
their prey, often at dawn and dusk. Although almost half of the
runs here occurred at/after dawn, runs occurred throughout the day 
and night (Fig. 3e). The individual cheetahs varied in their predilection 
for running in open grassland or dense shrub (Supplementary Fig. 6). 
On average, the cheetahs ran most often in open habitat (48%, 176 of 
367 runs); 28% of runs occurred in open shrub/around large trees, and 
24% occurred within dense vegetation. Only 20% of runs occurring in 
the open grasslands were identified as successful hunts, compared with 
31% of runs in dense cover. This difference in outcome was not sig- 
nificant (P = 0.054, chi-squared test) and is confounded by individual 
variation and habitat, but it does demonstrate that cheetahs do hunt 
successfully in all terrains. Vegetation may confer an advantage by
permitting stalking and limiting prey options for escape by manoeuv- 
ring; however, there was little difference in the distance or speed 
between terrains (Supplementary Table 1). 
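The run counts and success rates quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable and function names are illustrative.

```python
def success_rate(successes, runs):
    """Percentage of runs that ended in a successful hunt."""
    return 100.0 * successes / runs


# Runs per collared cheetah: three females, then two males.
runs_per_cheetah = [100, 66, 61, 84, 56]
collar_runs = sum(runs_per_cheetah)   # runs with full GPS-IMU data
all_runs = collar_runs + 530          # plus runs seen only in activity data

print(collar_runs, all_runs)
print(f"collar runs: {success_rate(94, collar_runs):.1f}% successful")
print(f"all runs:    {success_rate(223, all_runs):.1f}% successful")
print(f"open habitat share: {100.0 * 176 / collar_runs:.0f}% of collar runs")
```

The two success rates differ by less than one percentage point, which is the sense in which including the additional runs did not change the result.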
between a single high-acceleration cycle and consistent movement in the 30-s
window. Coordinate system: X lateral, positive left, Y fore–aft, positive forwards,
Z vertical, positive upwards. Time is local (coordinated universal time (UTC)
+ 2 h). ‘Hunt’ time is labelled. d, Doppler-derived velocity profile for hunt
determined by the GPS receiver at five updates per second. e, GPS-IMU-derived 
velocity profile for the chase; in b, d and e warmer (bright red) colours represent 
faster speeds. f, Accelerometer data recorded at 300 Hz for chase; X, blue; Y, green; 
Z, black. Red circles indicate forward acceleration peak used as event marker for 
stride cutting at, approximately, hindlimb foot contact. The high accelerations at 
zero velocity at t = 12-13 s suggest subduing prey and a successful hunt. An 
animation of a hunt is in Supplementary Video 1, plots of further runs are 
available in Supplementary Fig. 5, and all runs are in Supplementary Video 2. 
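The two activity measures described in panel c can be sketched as follows for a single accelerometer axis: the peak amplitude in each 30-s window, and the mean of the peaks from its fifteen 2-s bins. The sample rate, the use of absolute values and the function name are assumptions made here for illustration; the collar firmware itself is not described in this section.

```python
def activity_summary(axis_samples, sample_hz, window_s=30, bin_s=2):
    """Return (peak, mean of bin peaks) for each complete window of one axis."""
    per_window = sample_hz * window_s
    per_bin = sample_hz * bin_s
    summaries = []
    for start in range(0, len(axis_samples) - per_window + 1, per_window):
        window = [abs(a) for a in axis_samples[start:start + per_window]]
        bin_peaks = [max(window[i:i + per_bin])
                     for i in range(0, per_window, per_bin)]
        summaries.append((max(bin_peaks), sum(bin_peaks) / len(bin_peaks)))
    return summaries


hz = 10  # hypothetical sample rate for the activity channel

# One isolated high-acceleration cycle in an otherwise quiet window.
single_cycle = [0.1] * (30 * hz)
single_cycle[42] = 5.0

# Consistent movement: a moderate peak recurring throughout the window.
steady = [2.0 if i % 5 == 0 else 0.5 for i in range(30 * hz)]

print(activity_summary(single_cycle, hz))
print(activity_summary(steady, hz))
```

In the first case the window peak is far larger than the mean of the bin peaks; in the second the two values coincide, which is how the relative values separate a single event from sustained movement.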
Figure 3 | Descriptive hunt statistics. a, Top speed, averaged over a stride,
reached in each run colour-coded for outcome. b, Distance covered in each run. 
c, Top speed in each run coded for terrain type. d, Peak acceleration and 
deceleration recorded in each run. e, Plot of time of day of runs recorded in 
period when collar was set to trigger at any time of day, time local. f, Example 
hunt file colour-coded for speed (bright red denotes fastest), and with 
horizontal acceleration vectors drawn, to scale, for each stride. n = 367 (a–d) and n = 254 (e).


Description of hunts 


The average run distance was 173 m (±116 m) (Fig. 3b), though
recorded run distance will be shorter than the true value in the runs 
where the start of the run was missed (Methods, Supplementary 
Video 2). The longest runs recorded by each cheetah ranged from 
407 to 559 m; the mean run frequency (including information from 
activity data) was 1.3 times per day, so, even if some hunts were 
missed, high speed locomotion only accounted for a small fraction 
of the 6,040-m average daily total distance covered by the cheetahs. 
The mean top speed was 14.9 ± 3.4 m s⁻¹ and was usually only sustained
for 1–2 s. The highest speed we recorded was a stride-averaged
25.9 m s⁻¹ in run 250 (Fig. 3a, c and Supplementary Video 2). The top
speeds attained by the other cheetahs were 25.4, 22.0, 21.1 and
20.1 m s⁻¹. The cheetahs studied here mostly hunted impala
(Aepyceros melampus), which made up 75% of their diet, although
one male cheetah (Qamar), which frequently hunted in thicker vegetation
(Supplementary Fig. 6), never exceeded 20.1 m s⁻¹ and was often
observed on warthog (Phacochoerus africanus) kills. Cheetah hunting
the (anecdotally) faster Thomson’s gazelle (Eudorcas thomsonii) on
open East African savannah may use higher speeds. 

Successful hunts involved greater deceleration on average (−7.5 m s⁻²
versus −5.5 m s⁻²; P < 0.05; Fig. 3d), but there was no significant
difference in peak acceleration (Fig. 3d), distance travelled (Fig. 3b),
number of turns (6.7 versus 6.5) or total turn angle (347° versus 260°)
(generalized linear mixed model (GLMM); Methods). This indicates
that outcome was determined in the final stages of a hunt rather than
hunts being abandoned early to save energy or reduce risk of injury, 
and the higher deceleration values may reflect actual prey capture. 
Equivalent locomotion and outcome data for coalition-hunting cheetah 
might clarify the importance of the final manoeuvring phase in hunt 
outcome. 


Comparison with other athletic animals 


The greatest acceleration and deceleration values were almost double
values published for polo horses and exceeded the accelerations
reported for greyhounds at the start of a race. The cheetahs sped
up by up to 3 m s⁻¹ and slowed by up to 4 m s⁻¹ in a single stride
(Supplementary Fig. 5d). Mass-specific change in kinetic energy over a stride
(Fig. 4c and Supplementary Fig. 7) exceeded 30 J kg⁻¹ stride⁻¹ across
the broad speed range of 10 to 18 m s⁻¹. On the basis of forward acceleration,
the greatest stride-averaged whole-animal powers often
exceeded 100 W kg⁻¹ (body mass) (Fig. 4d), and also occurred between
10 and 18 m s⁻¹. For comparison, we calculated a stride-averaged
power of 25 W kg⁻¹ for Usain Bolt’s 9.58-s 100-m world record (Methods
and http://berlin.iaaf.org/records/biomechanics/index.html), consistent
with other measurements on human sprinters; polo horses achieve
30 W kg⁻¹ (ref. 18) and racing greyhounds 60 W kg⁻¹ (ref. 18).
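To make the scale of these figures concrete, the sketch below computes the mass-specific change in horizontal kinetic energy over one stride and the corresponding stride-averaged power. It considers forward kinetic energy only. The stride frequency of 3 Hz is an assumption for illustration, loosely suggested by the remark elsewhere in the article that three strides take about 1 s; the function names are hypothetical.

```python
def stride_work_per_kg(v_start, v_end):
    """Change in horizontal kinetic energy per unit body mass (J/kg)."""
    return 0.5 * (v_end ** 2 - v_start ** 2)


def stride_power_per_kg(v_start, v_end, stride_hz):
    """Stride-averaged power per unit body mass (W/kg)."""
    return stride_work_per_kg(v_start, v_end) * stride_hz


stride_hz = 3.0  # assumed stride frequency

# Speeding up by 3 m/s in one stride, starting from 10 m/s.
print(stride_work_per_kg(10.0, 13.0), stride_power_per_kg(10.0, 13.0, stride_hz))

# Slowing by 4 m/s in one stride, starting from 16 m/s (negative values).
print(stride_work_per_kg(16.0, 12.0), stride_power_per_kg(16.0, 12.0, stride_hz))
```

Under these assumptions, a 3 m/s gain in a single stride from 10 m/s corresponds to more than 30 J/kg per stride and about 100 W/kg, in line with the magnitudes quoted above; the deceleration case gives a larger magnitude still.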

The locomotor (limb and back) muscle accounts for 45 + 4% of 
body mass in captive cheetah. The wild cheetahs had similar limb
and back lengths to those captive cheetahs, but were heavier at 53 kg 
versus 33 kg (means, n = 5, 5), and visibly more muscled (mean mid- 
thigh girth 540 mm versus 450 mm, n = 5, 5), so much of their body 
mass is locomotor muscle. Major propulsive muscles such as the 
hamstrings (biceps femoris, semimembranosus and semitendinosus) 
at the hip and gastrocnemius at the tarsus have 64% and 60% longer 
moment arms, respectively, than in the greyhound and similar muscle 
fibre lengths. Stride frequency and posture are similar at the same
speed in the two species, so the muscle sarcomeres (and fibres) will be
shortening considerably faster in the cheetah than in the greyhound at 
the same speed (like the engine of a car in a lower gear). This fast 
muscle contraction velocity will enable large muscle powers and 
hence deliver the very large acceleration powers observed. The high
muscle speed and power are consistent with our measurements on
contracting skinned fibres from cheetahs. The cheetah deceleration
magnitudes (Figs 3d and 4b), cycle works (Fig. 4c) and powers 
(Fig. 4d) were greater than during acceleration and up to three times 
higher than those of polo horses; however, comparative figures are sparse.
Cheetah can crouch to engage locomotor muscle to enable these 
deceleration magnitudes (Fig. 1d), and sliding or colliding with the 
prey may dissipate some energy. 


Grip and manoeuvrability key to hunting success 


Hunts involved considerable manoeuvring, with maximum lateral 
(centripetal) accelerations often exceeding 13 m s⁻² at speeds less
than 17 m s⁻¹ (Fig. 4e, f; polo horses achieve 6 m s⁻²; ref. 3). A lateral
acceleration of 13 m s⁻² (Fig. 1b) requires a coefficient of friction with
the ground of at least 1.3. Ridged footpads and substantial claws
(Fig. 1c) act as cleats to augment friction and deliver this level of grip.
The maximum centripetal acceleration observed was smaller at speeds
greater than 17 m s⁻¹ (Fig. 4e), which may be behavioural in origin;
that is, cheetahs do not perform tight turns at their highest speeds. 
Studies on other animals show that, although grip limits turning 
performance at low and moderate speed, a model based on the capacity 
of the limbs to withstand the combination of centripetal acceleration 
and gravity (Fig. 1b) is appropriate to account for reduced speed on 
bends in humans, mice and racehorses but not greyhounds. The
dashed line labelled LFL (leg force limit) in Fig. 4e is calculated using
published models, published stride data and the maximum speed
recorded here. The equations and assumptions are presented in the Sup- 
plementary Information. The LFL line seems to follow the upper bound 
of the data points at higher speeds, but confident verification would, 
Figure 4 | Performance summary. a, Stride frequency plotted against speed;
each point is colour-coded for tangential (forwards) acceleration, bright red
points represent the greatest forward acceleration, and are plotted last (on top).
Lines are linear regression of stride frequency against speed for each individual
cheetah. b, Tangential (forwards, positive) acceleration and deceleration (y axis)
against speed (x axis). Horizontal lines represent acceleration and deceleration of
13 m s⁻², equating to the proposed grip limit of 1.3 (see text). Curved lines
represent stride-averaged whole-body powers of ±30, 60, 90 and 120 W kg⁻¹;
points outside the outer dashed line equate to a mean stride power in excess
of ±120 W kg⁻¹. c, Body-mass-specific horizontal kinetic energy change
performed in each stride (work per stride). d, Stride-averaged whole-body
acceleration power plotted against speed, with horizontal lines showing powers
of ±30, 60, 90 and 120 W kg⁻¹. e, Horizontal speed against turn radius, region
around origin magnified in inset. Slanting straight lines show different rates of
heading change in degrees per second, with values (2, 6, 10, 16, 25, 43 and 112) at


however, require stance times or limb forces during manoeuvring.
When combined with gravity, a lateral acceleration of 13 m s⁻² equates
to a 66% increase in the cheetah’s effective weight and hence average
limb force (Fig. 1b). Cheetahs have relatively large limb bone cross-sectional
areas (compared with greyhounds), which may be an
adaptation to resist the large peak limb forces that occur during high 
speed manoeuvring. 
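The grip and limb-loading figures in this section follow from simple vector arithmetic, sketched below. The required coefficient of friction is the lateral acceleration divided by gravitational acceleration, and the effective weight is the resultant of the two perpendicular accelerations. The value g = 9.81 m/s² and the function names are assumptions made for this illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def required_friction(lateral_accel):
    """Minimum coefficient of friction needed to sustain a lateral acceleration."""
    return lateral_accel / G


def effective_weight_factor(lateral_accel):
    """Resultant of gravity and lateral acceleration, as a multiple of body weight."""
    return math.hypot(G, lateral_accel) / G


for label, accel in (("cheetah", 13.0), ("polo horse", 6.0)):
    mu = required_friction(accel)
    extra = 100.0 * (effective_weight_factor(accel) - 1.0)
    print(f"{label}: friction >= {mu:.2f}, effective weight +{extra:.0f}%")
```

For 13 m/s² this gives a friction coefficient of about 1.3 and an increase in effective weight of about 66%, matching the values quoted in the text; 6 m/s² gives a coefficient of about 0.6.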

The cheetah should run little faster than its prey in the manoeuvring
phase of the hunt if it is to capture an agile and quick-turning
prey. A cheetah running at 25.9 m s⁻¹ with the maximal observed
lateral acceleration of 13 m s⁻² would have a turn radius of 52 m and
would take 6 s to perform a 180° turn (πr v⁻¹); peak running speed is
therefore unlikely to be, and was not found to be, a feature of the final
stage of successful hunts. A cheetah can slow by 4 m s⁻¹ in a stride
(Supplementary Fig. 5d), and the cheetahs often decelerated sharply
before turning, which would enable much tighter turns. Slowing from
16 m s⁻¹ to 4 m s⁻¹ (three strides, 1 s) would drop the turn radius with
v² r⁻¹ = 13 (lateral acceleration of 13 m s⁻²) from 19.7 m to 1.2 m, and
heading velocity (v r⁻¹) would rise from 46 to 190° s⁻¹. This demonstrates
the value of slowing down before manoeuvring. The cheetahs
did not use highest tangential and centripetal accelerations simulta- 
neously, consistent with grip limiting maximal horizontal acceleration 
(there are few data points in the corners of the square in Fig. 4f). Rapid 
deceleration would unload the hindquarters, which could result in yaw 
the top of the line. The solid curved line (μ = 1.3) represents a grip limit/
coefficient of friction of 1.3; the curved shorter-dashed line (μ = 0.6) denotes the
0.6 grip limit reported for polo horses; points above each line require a higher
grip level. The curved longer-dashed line (LFL) represents a limit to turning
defined by the maximum force the legs can withstand. f, Plot of tangential 
acceleration against lateral acceleration. Total horizontal acceleration is the 
distance from the origin, circles represent mean total horizontal acceleration of 6 
and 13ms ” (equating to average grip limits of 0.6 and 1.3). Each point on each 
plot represents data centred on a single stride, with data smoothed over three 
strides. Points are colour-coded by individual, except in plot a. The number of 
strides from each cheetah were 5,031, 4,022, 3,211, 2,657 and 1,895 giving a total 
n of 16,816 for plots b, c, d and f. The total 7 is given in each plot and was slightly 
different for plots a and e owing to the mathematics of generating those plots but 
the individual contributions were in proportion. 


instability when manoeuvring because the centre of mass (COM) is 
behind the forelimbs (like a ground loop in a tail wheel aircraft). The 
pitch limit proposed in ref. 18 may apply at low speed, but insufficient 
low-speed data exist to consider this further, and it can be circum- 
vented by posture due to the cheetah’s flexible spine (Fig. 1d). The 
active movements of the high-inertia tail that are observed in wildlife 
documentaries will help in positioning and banking the body (and 
limbs) to apply appropriate forces to prevent this and for turn initiation 
and manoeuvring. 
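The turn radii, heading rates and turn time quoted in this discussion follow from r = v²/a and ω = v/r. The sketch below reproduces them for the speeds mentioned in the text; the function names are illustrative only. At 4 m s⁻¹ the computed heading rate is about 186° s⁻¹, which the text rounds to 190° s⁻¹.

```python
import math


def turn_radius(speed, lateral_acc):
    """Turn radius (m) from r = v^2 / a."""
    return speed ** 2 / lateral_acc


def heading_rate_deg(speed, lateral_acc):
    """Heading angular velocity (degrees per second) from omega = v / r = a / v."""
    return math.degrees(lateral_acc / speed)


def half_turn_time(speed, lateral_acc):
    """Time (s) to complete a 180 degree turn, pi * r / v."""
    return math.pi * turn_radius(speed, lateral_acc) / speed


for v in (25.9, 16.0, 4.0):
    print(f"v = {v:4.1f} m/s: r = {turn_radius(v, 13.0):5.1f} m, "
          f"heading rate = {heading_rate_deg(v, 13.0):5.1f} deg/s")
```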


Perspective 

Equivalent data for other wild cursorial species would enhance what 
we know about natural speed, agility, endurance and locomotor phy- 
siology, and provide detailed information on ranging behaviour in the 
wild. For example, such fine-scale data on habitat selection by endan- 
gered species detailing where animals are commuting, hunting and 
resting will be informative when attempting to evaluate landscape 
scale connectivity, corridors and wildlife-protected areas. Tightly 
coupled GPS-IMU processing can deliver 0.2-m position accuracy 
(the level of individual shrubs and footfalls) during hunts, enabling 
detailed analysis of context variables (such as habitat characteristics 
and prey visibility), modes of hunting success and failure, and the 
effect of slope, camber and foot-surface interaction on stride-by-stride 
performance. These data on hunt environment would shed light on
the determinants of preferred hunting habitats, risk of injury (of
paramount importance for solitary predators), risk of detection by
kleptoparasites (open versus closed habitat), available palatable grazing
and habitat-dependent risk of predation (detection).


METHODS SUMMARY 


Collars moved between six operating states depending on the time of day, the 
activity level of the cheetah, and battery voltages (Supplementary Fig. 1). If the 
cheetah was active (detected via accelerometers) at a time when hunting was
likely, accelerometer data samples were continuously buffered in memory, and 
the GPS module was regularly triggered (‘refreshed’) to maintain an internal state 
ready for immediate start-up. When a run started, GPS data at 5 Hz and full IMU 
data at 300 Hz were recorded. The GPS-IMU data were post-processed in a 
loosely coupled extended Kalman smoother optimized for sensor characteristics 
(Methods) and cheetah dynamics. Horizontal position error (median stride-wise 
standard deviation (s.d.), n = 45,851) was reduced from 5.05 m (pure GPS data) 
to 0.67 m in the smoothed solution. Speed error was reduced from 1.23 m s⁻¹ to
0.34 m s⁻¹ (Supplementary Fig. 3). The initial seconds of the run were reconstructed
by open-loop inertial integration, backwards in time, using buffered
IMU data and smoothed GPS-IMU data for initial conditions. Data were seg- 
mented into strides using the horizontal acceleration signal, and a rolling average 
was applied to the stride duration, speed and heading rate data (Methods) to
ensure that cutting did not result in erroneous extreme values in these or derived 
parameters (Supplementary Fig. 4). Activity summaries, based on accelerometer 
readings, were recorded for each 30-s period throughout the rest of the day, with a 
GPS position every 5 min when the cheetah was on the move. The dynamic
performance of the collar for track and speed was verified by running a dog on 
a beach (Supplementary Fig. 2); footprint position in the sand was determined 
using survey-grade GPS, and footfall time from GPS time-stamped high-speed 
video. 


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS


Animals. The cheetahs used in this study were part of a continuing study by 
Botswana Predator Conservation Trust (http://www.bpctrust.org) in the Okavango 
Delta region of Northern Botswana. Initially, three ‘mark 1’ prototype collars were 
fitted to three cheetahs in July 2011. All collars successfully collected data as inten- 
ded, two collars for 7-9 months whereas the third suffered a memory card failure 
after 6 months. Three collars of a new ‘mark 2’ design were used in April 2012, and
two more collars in July 2012 (fitted to the original three cheetahs plus two new 
individuals). Data were again successfully collected from these collars, and they 
continue in operation. 

The cheetahs were immobilized by free darting from a vehicle by A.M.W. using 
medetomidine (2 mg) and ketamine (80-120 mg) and reversed after 60 min with 
10 mg atipamezole. While sedated, dimensions including limb lengths, thigh 
girths and back lengths and body mass were recorded. Collar data were down- 
loaded by radio link every few weeks to a ground vehicle or a light aircraft. 
Collar design and fabrication. The major design challenges included the measur- 
ement and logging of data at a sufficiently high rate and accuracy, timely remote 
retrieval of substantial volumes of data from the collar and maintaining the very 
low average power consumption required in a wildlife collar. To conserve power, 
careful management of the internal readiness of the GPS subsystem allowed this 
and other sensor systems to be started quickly enough to capture data at maximum
rate only during runs.

All collars were constructed in-house. In the original collars (mark 1, used in
2011), a commercial radio-tracking collar (Sirtrack, New Zealand) was used as a
base, our custom electronics package being mounted on the top of the collar in a 
clear cast resin case and wired to the collar’s original battery box at the bottom of 
the collar. The revised mark 2 collars (Fig. 1a) were entirely constructed in-house, 
with a revised lower-profile electronics enclosure (cast from polyurethane resin
using a silicone mould and a rapid prototyped former; Aprocas GmbH) and a
vacuum-formed polycarbonate battery box holding larger rechargeable and back-up
batteries in potting compound. The actual electronics package was similar on
both versions, with an identical chip set as described below, and with almost 
identical software functionality. Collar mass was approximately 340 g. 
Collar design: electronics payload. The collar circuit was based around a low- 
power MSP430 16-bit microcontroller (Texas Instruments), running custom 
software written in the C programming language, developed using an integrated
development system from IAR Systems. The microcontroller contains several 
internal peripheral blocks, including an 8-channel 12-bit analogue-to-digital con- 
verter (ADC), four serial communications modules, plus various timers, general- 
purpose digital input and output lines, and other support modules. A connected 
2-GB micro-SD flash memory card (Sandisk) provided data storage. 

GPS position was obtained from an LEA-6T GPS module (u-Blox AG). In 
addition to internally computed position and velocity, the module is able to 
generate raw pseudo-range, phase and Doppler data for the signal from each 
satellite enabling detailed GPS performance evaluation, and use of customized 
differential techniques for increased accuracy. The data rate was five position, 
velocity and raw data points per second during continuous operation (for 
example, during a chase). 

The collar circuit also included an inertial measurement suite, based on MEMS 
devices. Acceleration was measured using an MMA7331 three-axis accelerometer 
module (Freescale Semiconductors), providing acceleration with a ±12 g range.
The roll and pitch rotation rate was measured by a dual-axis gyroscope (ST
Microelectronics), and yaw rotation rate by a single-axis gyroscope (ST
Microelectronics), both set to the 2,000° s⁻¹ range. Sensor outputs were filtered
by simple single-pole analogue filters (100 Hz knee), and then sampled by the
microcontroller ADC at 300 or 100 samples per second (accelerometers or
gyroscopes, respectively). A rate of 300 Hz was chosen to give headroom over
a frequency of 30 Hz; that is, 1/(minimum published stance time). A three-axis
magnetometer (Honeywell), connected via I²C, provided magnetic compass
functionality at 12 measurements per second.

Primary communication with the collar, for tasks such as data file download 
and configuration file upload, was via a 2.4-GHz chirp-spread-spectrum communication
module (Nanotron Technologies GmbH), communicating at 1 Mbit
per second using a custom communications protocol. A 173-MHz VHF radio 
transmitter (Radiometrix) provided longer-range transmission of current GPS- 
derived position, for tracking purposes. An original equipment manufacturer 
(OEM) conventional wildlife tracking transmitter in the 149-MHz band 
(Sirtrack) facilitated long-range animal location using conventional direction- 
finding techniques. 

Collar design: power. Primary power supply for the collar was a 900 mAh 
lithium-polymer rechargeable battery (Active Robots), charged by a solar cell 
array consisting of 10 monocrystalline silicon solar cells (Ixys Korea). On the
mark 2 collars, a 13 Ah lithium thionyl chloride primary battery (Saft) provided
a back-up power source (on the original collars, a 7.7 Ah lithium thionyl chloride
primary battery was used). Both battery voltages, together with the charge current 
from the solar cell array, were measured by the microcontroller, which switched 
the collar electrical load from one battery to the other depending on battery state. 
Collar design: software states and movement detection. In operation, the collar 
software moved between several different operating ‘states’, the particular state at 
any moment being dependent on a combination of animal activity level (measured 
using the accelerometers) and time of day (from a GPS-synchronised software 
clock). Each state required a different mix of hardware sub-systems to be powered 
on or off, and different intervals between GPS module operation, so the power
consumption of the collar varied with the operating state. As a result, the
inevitable compromise between average power consumption on the one hand, 
and quantity and resolution of data gathered on the other, could be optimized 
by setting the parameters for the state transitions. The different operating states 
and associated average power consumption for the collar are summarized in 
Supplementary Fig. 1. 

To keep the average power consumption as low as possible, the collar would 
generally default to operating in state 1 (‘alert’ state). In this state, to detect when 
the cheetah was moving, the accelerometer was sampled at 30 Hz for a period of 
10 s in every minute. Within each 10-s sampling period, the peak-to-peak acceleration
was computed for each axis every 2 s, and an accumulator incremented by a
specified value for each 2-s window in which the peak-to-peak acceleration
exceeded a pre-set threshold. For each 2-s window in which the peak-to-peak
acceleration did not exceed the threshold, the accumulator was decremented by a
(different) specified value. Thus, periods of movement could be given higher
‘weight’ than periods of no movement or vice versa to identify stalking. If the 
accumulator total exceeded a specified value, the cheetah was deemed to be 
consistently moving and the collar switched to a higher operating state, the exact 
state depending on time of day. A similar algorithm with different weights and 
thresholds was then used to determine when the animal had settled back to rest, at 
which time a switch back to the lower state was executed. 
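The accumulator scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the collar firmware: the increment, decrement, threshold and trigger values are hypothetical, a window is treated as 'moving' if any one axis exceeds the threshold, and the accumulator is floored at zero.

```python
def peak_to_peak(samples):
    """Peak-to-peak range of one axis within a window."""
    return max(samples) - min(samples)


def update_accumulator(total, windows, threshold, up=2, down=1):
    """Update the movement accumulator over a sequence of 2-s windows.

    windows: list of windows, each a list of (x, y, z) accelerometer samples.
    up, down: hypothetical increment and decrement weights.
    """
    for window in windows:
        axes = zip(*window)  # regroup samples into one tuple per axis
        moving = any(peak_to_peak(axis) > threshold for axis in axes)
        total = total + up if moving else max(0, total - down)
    return total


def is_consistently_moving(total, trigger=5):
    """True when the accumulator exceeds the (hypothetical) trigger value."""
    return total > trigger


# One 2-s window at 30 Hz holds 60 samples.
still = [(0.0, 0.0, 1.0)] * 60
moving = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)] * 30
print(update_accumulator(0, [moving, moving, still], threshold=0.5))
```

Choosing a larger increment than decrement, as here, lets brief pauses within a bout of movement pass without resetting the detector.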

When consistently moving between local times of 06:00 and 09:00, and 17:00
and 19:00 (times when hunting was most likely, based on previous work), the operating
state would transition to state 3 (‘ready’ state). The GPS was refreshed every
30 s and position recorded every 60 s. Accelerometer data were recorded into a
circular buffer at 100 Hz, the buffer storing the latest 3 s of data. If the fore-aft
accelerometer data then exceeded a threshold equivalent to galloping, state 4 
(‘chase’ state) would be entered. The buffered data were stored and 5 Hz GPS 
data, 300 Hz accelerometer, 100 Hz gyroscope and 12 Hz magnetometer data 
recorded. A record was defined as valid if five further peaks (strides) were 
detected, and then recording would continue until there were no peaks above 
the threshold for 5 s. When moving consistently but outside of the peak hunting 
times, the lower-powered state 2 (‘mooch’ state) would be invoked, with GPS 
positions being taken every 5 min and simple activity measurements being taken 
as described below. The GPS delivered a first fix in 1.30 s after triggering (median), 
accurate position data (<10 m s.d.) after 1.58 s, and full rate data (5 Hz) after 5.4 s
(Supplementary Fig. 3). The unexpectedly long delay in the GPS module delivering 
5 Hz data prevented open-loop GPS-IMU integration back to the beginning of the 
run in some cases. This is why many runs in Supplementary Video 2 do not start at 
low speed. 
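The transitions between the alert, mooch, ready and chase states, together with the 3-s pre-buffer, can be sketched as a small state machine. This is a simplified illustration, not the collar firmware: the sleep state and battery logic are omitted, and the `moving` and `galloping` flags are hypothetical inputs standing in for the accelerometer-based detectors (with `galloping` assumed to stay true until 5 s pass without a peak above threshold).

```python
from collections import deque
from enum import Enum


class State(Enum):
    ALERT = 1  # default low-power state
    MOOCH = 2  # moving outside peak hunting times
    READY = 3  # moving during peak hunting times
    CHASE = 4  # run in progress


HUNTING_HOURS = ((6, 9), (17, 19))  # local-time windows given in the text


def in_hunting_window(hour):
    return any(start <= hour < end for start, end in HUNTING_HOURS)


def next_state(state, moving, hour, galloping):
    """Choose the next state from activity, time of day and gallop detection."""
    if state is State.CHASE and galloping:
        return State.CHASE
    if not moving:
        return State.ALERT
    if not in_hunting_window(hour):
        return State.MOOCH
    if state is State.READY and galloping:
        return State.CHASE
    return State.READY


# Circular pre-buffer: 3 s of accelerometer samples at 100 Hz.
prebuffer = deque(maxlen=300)
prebuffer.extend(range(500))  # older samples are overwritten automatically
print(next_state(State.READY, True, 7, True), len(prebuffer))
```

A bounded `deque` is a convenient stand-in for the circular buffer because appending beyond `maxlen` silently discards the oldest samples.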
Collar power handling and power consumption. Average collar power con- 
sumption varied between individual animals (owing to differing patterns of 
activity and hence a different distribution of collar operating states), but was 
typically around 4 mA when averaged over 24 h. The main contributor to this
average was the time spent in the ready state when the animal was active during 
hunting times of day (Supplementary Fig. 1), in which average consumption was 
around 16 mA with a 30-s GPS refresh time. By comparison, the time spent in the 
mooch state (animal active but outside hunting times) had a lower consumption 
of about 5 mA, whereas ‘sleep’ or alert states (animal inactive) contributed only 
about 0.6 mA. The ‘chase’ state, used only when the animal is running, required 
some 90 mA, but time spent in this state was very small. Solar charge currents 
ranged from 35 mA with the animal in full sunlight, to typically 10 mA in dappled 
shade and almost zero in deeper shade. Average charge current over a 24-h period 
was typically 2 mA, with some variation between animals due to terrain prefer- 
ences, indicating little time spent in full sunlight even in the winter study period. 
The solar cells, via the rechargeable battery, contributed roughly 75% of the collar 
power, the remainder being supplied by the non-rechargeable battery. Collar 
battery life was predicted at approximately one year with these settings, but was 
very dependent on collar settings and animal behaviour. 
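The 24-h average current is a time-weighted mean of the per-state currents quoted above. The sketch below shows that calculation; the state currents come from the text, whereas the time fractions are hypothetical values chosen only to illustrate how an average near 4 mA can arise.

```python
# Approximate per-state currents (mA) quoted in the text.
STATE_CURRENT_MA = {"sleep/alert": 0.6, "mooch": 5.0, "ready": 16.0, "chase": 90.0}


def average_current(time_fractions, currents=STATE_CURRENT_MA):
    """Time-weighted mean current (mA) for fractions of the day spent in each state."""
    total = sum(time_fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(currents[state] * frac for state, frac in time_fractions.items())


# Hypothetical split of a 24-h day between states (not measured data).
example_day = {"sleep/alert": 0.70, "mooch": 0.14, "ready": 0.155, "chase": 0.005}
print(f"{average_current(example_day):.2f} mA")
```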

On cheetahs four and five, the ready state GPS refresh interval was changed
from 30 s to 300 s; this resulted in a typical power saving of around 30% over a
24-h period, with unexpectedly little effect on GPS start-up time (Supplementary 
Fig. 3f). We reduced power consumption on mark 2 collars (254 runs) by not 
pre-buffering data, and moving directly from mooch to chase state (and allowing
this to happen at any time of day, enabling Fig. 3e to be generated), so that IMU 
data logging began on the first accelerating stride when the cheetah was already in 
motion. The time that could be recovered through backwards integration was 
therefore reduced, and the first 1-2 acceleration strides lost. 

Collar design: generation of activity summaries. Throughout all states, a back- 
ground measurement of animal activity was also recorded. For every 2-s ‘window’, 
the maximum peak-to-peak acceleration range is recorded separately for all three 
accelerometer axes. After 15 ‘windows’ have passed, an activity record is generated, 
containing GPS time, the largest X, Y and Z peak-to-peak acceleration amplitudes 
seen in any of the 15 windows, and the average of the 15 2-s peak-to-peak X, Y and
Z acceleration amplitudes. This enabled differentiation of transient high-acceleration
events and consistent activity. This record is generated continuously in the
mooch and ready state, every 3 min in the alert state, and every 30 min in the sleep 
state. Amplitudes are higher than body acceleration, because the collar can move 
relative to the centre of mass. 
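A 30-s activity record of the kind described above could be assembled as in the following sketch. The record layout and field names are hypothetical; only the 15-window structure and the per-axis maximum and mean of the peak-to-peak amplitudes are taken from the text.

```python
def activity_record(gps_time, windows):
    """Summarize 15 two-second windows of (x, y, z) accelerometer samples.

    Returns the largest and the mean peak-to-peak amplitude for each axis.
    """
    # Peak-to-peak range per axis for every window.
    p2p = [[max(axis) - min(axis) for axis in zip(*window)] for window in windows]
    per_axis = list(zip(*p2p))  # regroup: one tuple of 15 values per axis
    return {
        "gps_time": gps_time,
        "max_p2p": tuple(max(values) for values in per_axis),
        "mean_p2p": tuple(sum(values) / len(values) for values in per_axis),
    }


quiet = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)]
burst = [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0)]
record = activity_record(gps_time=0, windows=[quiet] * 14 + [burst])
print(record["max_p2p"], record["mean_p2p"])
```

In this example a single transient burst gives a large maximum but only a modest mean, which is the distinction between transient events and consistent activity that the two statistics are meant to capture.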

All settings that affected the state transitions (times, acceleration thresholds,
and so on), and many other settings besides, could be modified by uploading a
new configuration file over the 2.4-GHz communications link. In addition, a 
complete new version of the collar firmware could be uploaded over this link, 
allowing for in-field program updates while the collar is on the animal. 
Sensor fusion and signal processing to capture hunting dynamics. In the collar 
data collected here, the power management features used gave different sampling 
rates for accelerometer (300 Hz) and gyro (100 Hz) in the chase state. To capture 
the full acceleration profile within the microcontroller, 3 s of accelerometer measurements
were continually buffered in ready state at a reduced sampling frequency
(100 Hz) and recorded when entering the chase state (gyro-power consumption
was too high to permit continuous pre-buffering). GPS position and velocity measurements
were usually (but not always) available within 1 s after entering the
chase state (Supplementary Fig. 3). 

The unique characteristics of these data required a custom-designed GPS-INS 
(inertial navigation system) integration method written in Visual C++ and 
MATLAB. Calibrated IMU measurements were first linearly interpolated to 
300 Hz. Orientation changes were assumed to be minimal during the buffer 
period, and hence the unmeasured gyro angular rates assumed to be zero. GPS 
and IMU measurements were fused using a 12-state extended Kalman filter in a
loosely coupled architecture. The total state formulation used propagates position,
velocity and orientation states with time using the IMU measurements in a simplified
form of the strap-down inertial navigation equations. The associated
process noise was estimated from the known error characteristics of the inertial 
sensors used. GPS position and velocity updates were used as measurement 
updates, and receiver accuracy data for each fix used to estimate measurement 
noise to appropriately weight the GPS to the inertial solution. 

The filter was run in reverse time from the last GPS observation of each run to 
the beginning of the buffered inertial data. During the short time periods in which
only inertial data were present (throughout the buffer and between GPS measurements),
the filter propagation was equivalent to open-loop inertial navigation.
The filter was initialized using the last GPS position and velocity data, and Euler
angles were assumed to be zero, with covariances appropriate for the uncertainty in that
assumption. A Rauch-Tung-Striebel (RTS) smoother was then applied in forward
time on the Kalman-filtered data. This is equivalent to combining backward
and forward solutions, effectively halving the open-loop INS integration period 
between GPS observations. It was not always possible to reconstruct the period 
before the first GPS observation, as this period was often too long or the accuracy 
of the initial GPS observations insufficient (Supplementary Fig. 3c-f). This will 
result in a somewhat short measurement of hunt distance in those cases (apparent 
qualitatively in Supplementary Video 2). 
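The filter-then-smooth idea can be illustrated with a one-dimensional toy problem. The sketch below is not the 12-state extended Kalman filter used here: it is a scalar linear Kalman filter with an RTS smoothing pass, it runs the filter forward and the smoother backward (the mirror image of the ordering described above), and all noise values are hypothetical.

```python
def kalman_rts(increments, measurements, q, r, x0, p0):
    """Scalar Kalman filter followed by a Rauch-Tung-Striebel smoother.

    increments[k]  : dead-reckoned change in the state over step k (inertial input)
    measurements[k]: direct observation of the state, or None when no fix exists
    q, r           : process and measurement noise variances
    Returns filtered means/variances and smoothed means/variances.
    """
    xp, pp, xf, pf = [], [], [], []  # predicted and filtered estimates
    x, p = x0, p0
    for u, z in zip(increments, measurements):
        x, p = x + u, p + q  # predict; open-loop when z is None
        xp.append(x)
        pp.append(p)
        if z is not None:  # measurement update
            gain = p / (p + r)
            x, p = x + gain * (z - x), (1.0 - gain) * p
        xf.append(x)
        pf.append(p)
    xs, ps = xf[:], pf[:]
    for k in range(len(xf) - 2, -1, -1):  # backward smoothing pass
        c = pf[k] / pp[k + 1]
        xs[k] = xf[k] + c * (xs[k + 1] - xp[k + 1])
        ps[k] = pf[k] + c * c * (ps[k + 1] - pp[k + 1])
    return xf, pf, xs, ps


steps = [1.0] * 10
fixes = [None, None, None, None, 5.2, None, None, None, None, 10.1]
xf, pf, xs, ps = kalman_rts(steps, fixes, q=0.04, r=0.25, x0=0.0, p0=1.0)
print(f"variance at first step: filtered {pf[0]:.3f}, smoothed {ps[0]:.3f}")
```

The smoothed variance is never larger than the filtered one, because each smoothed estimate also draws on the fixes that arrive later in the run.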

GPS-INS processing was used to reduce noise and improve precision in the 
position and velocity solution (Supplementary Fig. 3), as well as increasing the 
temporal resolution of the data. It also allowed determination of orientation, 
which is otherwise not directly measured. Because the GPS receiver also records 
raw pseudorange, Doppler and carrier phase measurements for each satellite, 
future data processing may use a stationary reference station to calculate a more 
accurate differential GPS solution. Use of a tightly coupled GPS-INS solution 
may also provide increased accuracy and robustness, especially during periods 
when a reduced number of satellites are tracked (for example, turns). 
Extraction of parameters for analysis: speed, distance and stride timing. Stride 
timings for data cutting and stride frequency were determined from the accelerometer
axis aligned approximately in the cranio-caudal direction. These accelerations
were first low-pass filtered at twice the anticipated stride frequency (8 Hz),
and a peak detection algorithm was used to detect forward acceleration peaks
at least 0.2 s apart (equal to a maximum stride frequency of 5 Hz).
Horizontal speed was calculated from filtered velocity and averaged over the
calculated strides ($v_i$) to remove the effects of speed fluctuation through the
stride and collar oscillation relative to the centre of mass. These data were then 
smoothed with a rolling average (see below). Run distance was calculated by zero- 
order hold integration of the stride averaged horizontal speeds over the duration 
of the run. Maximum speed during each run was determined from these values. 
Stride frequency was calculated from the duration between stride timing peaks. 
For consistency in comparison, other parameters were then determined using the 
same method as in ref. 3, using only two-dimensional position and speed mea- 
surements. Position data were first down-sampled to the calculated stride times. 
The displacement vectors between consecutive positions were then calculated: 
in which $P_i$ is the two-dimensional position at sample/stride i.
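Stride segmentation and run distance, as described in this section, can be sketched as below. The filter type is an assumption (a single-pole low-pass stands in for whatever filter was actually used; the text gives only the 8 Hz cutoff), and the peak threshold and function names are hypothetical.

```python
import math


def low_pass(signal, fs, cutoff_hz):
    """Single-pole IIR low-pass filter (assumed filter type)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    out, y = [], signal[0]
    for s in signal:
        y += alpha * (s - y)
        out.append(y)
    return out


def detect_peaks(signal, fs, min_gap_s=0.2, threshold=0.0):
    """Indices of local maxima above threshold that are at least min_gap_s apart."""
    min_gap = int(round(min_gap_s * fs))
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]:
            if peaks and i - peaks[-1] < min_gap:
                if signal[i] > signal[peaks[-1]]:
                    peaks[-1] = i  # keep the taller of two close peaks
            else:
                peaks.append(i)
    return peaks


def run_distance(stride_speeds, stride_durations):
    """Zero-order-hold integration: each stride speed is held for its duration."""
    return sum(v * t for v, t in zip(stride_speeds, stride_durations))


fs = 300  # Hz, accelerometer rate in the chase state
fore_aft = [math.sin(2 * math.pi * 3.0 * n / fs) for n in range(2 * fs)]  # synthetic 3 Hz gallop
peaks = detect_peaks(low_pass(fore_aft, fs, cutoff_hz=8.0), fs)
stride_freq = [fs / (b - a) for a, b in zip(peaks, peaks[1:])]
print(len(peaks), [round(f, 2) for f in stride_freq])
```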

Extraction of parameters for analysis: acceleration and power. A signed change 
of heading ($\Delta\theta_i$), and hence heading angular velocity ($\omega_i$), were then calculated
from the angle between the two vectors:
in which $\Delta T$ is the sampling interval.

The tangential or forward acceleration ($a_{t,i}$) and centripetal acceleration ($a_{c,i}$),
as well as instantaneous turn radius ($r_i$), were then calculated:
Finally, mass-specific COM power was calculated as the dot product of stride-averaged
acceleration and stride-averaged velocity (that is, forward
acceleration multiplied by forward speed):


$$k_i = a_{t,i} v_i$$


Mass-specific COM stride work (net COM kinetic energy change in a stride) was 
calculated as change in speed over a stride multiplied by stride average speed. 
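The chain from stride positions to heading change, accelerations, turn radius and power can be sketched as follows. The displayed equations are not reproduced in this text, so the specific forms used here are assumptions: heading change is taken as the signed angle between consecutive displacement vectors, tangential acceleration as a central difference of stride speed, and a constant stride interval `dt` is assumed.

```python
import math


def stride_kinematics(positions, speeds, dt):
    """Per-stride kinematics from 2D positions and stride-averaged speeds.

    positions: list of (x, y) at consecutive stride times
    speeds   : stride-averaged horizontal speed at each position
    dt       : interval between strides (assumed constant in this sketch)
    """
    results = []
    for i in range(1, len(positions) - 1):
        d_prev = (positions[i][0] - positions[i - 1][0],
                  positions[i][1] - positions[i - 1][1])
        d_next = (positions[i + 1][0] - positions[i][0],
                  positions[i + 1][1] - positions[i][1])
        cross = d_prev[0] * d_next[1] - d_prev[1] * d_next[0]
        dot = d_prev[0] * d_next[0] + d_prev[1] * d_next[1]
        dtheta = math.atan2(cross, dot)  # signed heading change (rad)
        omega = dtheta / dt              # heading angular velocity (rad/s)
        a_t = (speeds[i + 1] - speeds[i - 1]) / (2 * dt)  # assumed central difference
        a_c = speeds[i] * omega          # centripetal acceleration
        radius = speeds[i] / abs(omega) if omega else math.inf
        results.append({"dtheta": dtheta, "omega": omega, "a_t": a_t,
                        "a_c": a_c, "radius": radius, "power": a_t * speeds[i]})
    return results


# Synthetic check: constant speed around a circle of radius 10 m at 1 rad/s.
circle = [(10 * math.cos(0.1 * k), 10 * math.sin(0.1 * k)) for k in range(5)]
strides = stride_kinematics(circle, speeds=[10.0] * 5, dt=0.1)
print(strides[0])
```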
Extraction of parameters for analysis: improving accuracy through averaging. 
One important consideration when calculating heading, change of heading, and 
heading angular velocity from position measurements is that accuracy will 
decrease as speed decreases. Although averaging over a stride and across strides 
markedly improves the accuracy, lower average speed values will still be less 
accurate. The noise present is of a level that does not unduly influence extreme 
values even at very low speeds. 

Although validations carried out on the stride timing show that it is generally 
accurate (Supplementary Fig. 2f), detection of an incorrect or spurious peak for 
end of stride would result in one stride duration being under- or overestimated,
and the adjacent stride duration being affected in the opposite manner. This
would introduce error in parameters that do not change smoothly through a
stride, such as acceleration and kinetic energy. We therefore applied a weighted
average in which the stride period was averaged with the mean of the durations of
the preceding and following strides. The weighted average was of the form:


$$S_{i,w} = \frac{0.5\,S_{i-1} + S_i + 0.5\,S_{i+1}}{2}$$


in which S represents the parameter being weighted, and i is the stride number. 

This approach was used as follows: tangential acceleration and hence accelera- 
tion power were calculated based on a weighted average stride speed. Centripetal 
acceleration was based on weighted stride speed and weighted heading rate. Stride 
duration was also weighted. Where these parameters have been plotted against 
horizontal speed, the weighted stride speed was also used. Applying more averaging 
than this did not change the distribution of outliers to a discernible extent 
(Supplementary Fig. 4), but applying no averaging did result in more outliers,
giving us confidence in our extreme values with this treatment.
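The three-stride weighted average can be sketched as below. Two assumptions are made: the weights 0.5, 1 and 0.5 are normalized by their sum so that a constant series is unchanged, and the first and last strides, which lack a neighbour, are left unweighted.

```python
def weighted_stride_average(values):
    """Centre-weighted three-stride average with weights 0.5, 1, 0.5 (normalized)."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (0.5 * values[i - 1] + values[i] + 0.5 * values[i + 1]) / 2.0
    return out


# A mis-placed stride boundary shortens one stride and lengthens the next.
durations = [0.30, 0.20, 0.40, 0.30]
print([round(d, 3) for d in weighted_stride_average(durations)])
```

In this example the paired errors of 0.10 s shrink to 0.025 s, which illustrates why a spurious peak has less influence on derived accelerations after weighting.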

Extraction of parameters for analysis: grip and manoeuvring. Maximum traction
has been proposed as a potential constraint to turning performance. The coefficient
of friction, μ, is the maximum achievable ratio of horizontal force (acceleration)
with respect to vertical force (acceleration). Average vertical force per unit mass is
equal to acceleration due to gravity, and assuming that vertical and horizontal forces
are always in proportion:


$$\mu = \frac{ma}{mg}$$

So that maximum horizontal force and horizontal acceleration ($a_{\max}$) are:

$$m a_{\max} = \mu m g$$


$$a_{\max} = \mu g$$


in which g is acceleration due to gravity, and m is mass. Substituting for horizontal
acceleration in terms of tangential ($a_t$) and centripetal ($a_c$) components:


$$\sqrt{a_t^2 + a_c^2} = \mu g$$


This demonstrates the potential trade-off between tangential and centripetal
accelerations. Given that maximum centripetal acceleration will occur at constant speed
($a_t = 0$), and likewise that maximum tangential acceleration will occur in a straight
line ($a_c = 0$):


$$a_{c,\max} = \mu g$$

$$a_{t,\max} = \mu g$$

Remembering that centripetal acceleration is:

$$a_c = \frac{v^2}{r}$$


in which v is horizontal speed, and r is radius of turn, we form an equation for
maximum speed ($v_{\max}$) in terms of turn radius (r):


$$\frac{v_{\max}^2}{r} = \mu g$$


$$v_{\max} = \sqrt{\mu g r}$$


A maximum limit for tangential acceleration based on maximum available muscle
power (K) is derived as follows. When force and velocity are in the same direction:


$$K = Fv$$

$$K = m a_t v$$


in which F is force magnitude, v is horizontal speed, $a_t$ is tangential acceleration and
m is body mass. Given specific power by body mass (k):

$$k = \frac{K}{m}$$

Substituting gives:

$$a_t = \frac{k}{v}$$
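The grip and power limits derived in this section can be evaluated numerically as in the sketch below. The value of g and the specific power used in the example are assumptions for illustration, not values taken from this section.

```python
import math

G = 9.81  # m s^-2, assumed standard gravity


def max_speed_for_radius(mu, radius, g=G):
    """Grip-limited speed for a given turn radius: v_max = sqrt(mu * g * r)."""
    return math.sqrt(mu * g * radius)


def max_centripetal_acc(mu, a_t, g=G):
    """Centripetal acceleration left available when a_t is already in use."""
    total = mu * g
    if abs(a_t) > total:
        raise ValueError("tangential acceleration exceeds the grip limit")
    return math.sqrt(total ** 2 - a_t ** 2)


def power_limited_acc(k, speed):
    """Tangential acceleration allowed by specific power k (W/kg): a_t = k / v."""
    return k / speed


print(f"v_max on a 10 m radius at mu = 1.3: {max_speed_for_radius(1.3, 10.0):.1f} m/s")
print(f"a_c available with a_t = 5 m/s^2:   {max_centripetal_acc(1.3, 5.0):.1f} m/s^2")
print(f"a_t at 10 m/s with k = 100 W/kg:    {power_limited_acc(100.0, 10.0):.1f} m/s^2")
```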


Geometric limit to acceleration. A pitch limit for acceleration was previously 
proposed that assumes that propulsion is derived purely from hip extension.
This gives an acceleration limit for greyhounds of 10 m s⁻² at all speeds, derived
from back length and leg length, and the limit for cheetahs would be similar as
body height and length are similar. Such a limit is not exceeded in our data
(Fig. 4b), but there are few low speed acceleration strides. 

Collar validation. A lurcher (greyhound/whippet/terrier cross in this case) dog 
was fitted with a mark 2 collar and encouraged to undertake maximal accelera- 
tions and sharp running turns on a beach in England, UK (the dog was accustomed 
to collar-testing experiments). The position of each footfall was determined using 
survey-grade GPS (OEM4, Novatel). Dual-frequency Doppler, pseudorange
and phase GPS data were post-processed relative to local base station data using
Waypoint GrafNav 8.10 (Novatel) with a horizontal accuracy of 20 mm. The
timing of each footfall was determined from simultaneous high-speed video at
500 frames per second (f.p.s.) (X-Pri, 1280 × 1024; AOS GmbH). The camera trigger
event was captured via an interrupt channel on an RVC GPS logger module with
sub-millisecond accuracy, and used to express footfall events in GPS time for
comparison to collar data (Supplementary Fig. 2e). The four footfalls per stride
were easily identified in the position data (Supplementary Fig. 2a, b), and the
distance between subsequent non-lead forefoot footfalls was defined as stride length,
and the time between those footfalls as stride duration. Stride durations from video
and from collar data were compared by subtracting the video-derived stride
duration from the collar-derived stride duration and plotting the
difference as a histogram (Supplementary Fig. 2f). Speed was calculated by dividing
stride length by stride duration, and data were smoothed with a three-stride
centre-weighted rolling average as described for the collar data and the results plotted
(Supplementary Fig. 2d). These data show that qualitatively the collar reproduces
the track of the footfalls and that the speed-time (and hence acceleration) data are
indistinguishable between the two approaches. Further trials and analysis are 
required for a full assessment of the two methods. 
Statistics. To establish which aspects of a run correlate with success, generalized
linear mixed models (GLMMs) were fitted in R statistical software (R, version 2.14.1, 2011.
R Development Core Team 2011, Foundation for Statistical Computing, Vienna, Austria). In the
model, all the descriptive parameters of each hunt (terrain, distance, top speed,
peak acceleration and deceleration, number of turns and total turn angle) were
included as fixed effects. To control for individual variation, subject was included
as a random effect. If an effect was not significant, and removing it from the model 
improved the Akaike information criterion (AIC), then it was removed. A chi- 
squared test was used to evaluate the effect of terrain on outcome. 
Human acceleration power. Ten-metre split times for the 9.58 s world 100-m
record run by Usain Bolt in 2009 were retrieved from the IAAF website
(http://berlin.iaaf.org/records/biomechanics/index.html). A fifth-order polynomial was
fitted through the distance-time data. This polynomial visually fitted the data
points and was differentiated to give formulae for speed and acceleration through
the race; a function for instantaneous power through the race was calculated as the
product of the functions for speed and acceleration. This gave a peak centre of
mass power of 25 W kg⁻¹ body mass at 7 m s⁻¹, which is similar to previously
published values for human sprinters.
Hunting, terrain and outcome (success). Runs were identified in activity sum- 
maries by very high-peak acceleration amplitudes in all three axes, but particularly 
high accelerations in the cranio-caudal direction were the best indicator, con- 
firmed from GPS speed where present. If two run events were within 10 min of 
one another, they were considered to be the same event for outcome measures. 
Terrain was determined from Google Earth; georeferencing of known landmarks 
and road junctions was confirmed to be accurate to within 5 m in the study area. 
We identified feeding as a consistent signal on all three accelerometer axes 
(mean amplitude similar to mean of mean amplitudes), with particularly low 
cranio-caudal accelerations (compared with walking) and no change in location. 
See ref. 13 for more discussion. We classified a run as a successful hunt if 6 min of 
this feeding behaviour occurred in the 30 min after a run was identified. These 
methods correctly identified nine out of the ten known successful hunts using only 
the activity data (that is, without using GPS data), and correctly identified all nine 
as successful hunts. When applied to the main data set, the classification outcome 
correlated to other markers of success in 97% of known hunts. The other markers 
were: prey struggling captured in the accelerometer signal; cheetah remaining at 
hunt location for over two hours after the run; observing the cheetah on a kill. 
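The classification rule above can be written as a short sketch. Several details are assumptions made only for illustration: times are in seconds, feeding is supplied as a list of (start, end) bouts already detected from the accelerometer signal, the 6 min of feeding is treated as cumulative, and the 30-min window is measured from the last run of a merged event. The function names are hypothetical.

```python
RUN_MERGE_WINDOW_S = 10 * 60   # runs within 10 min count as one event
FEED_WINDOW_S = 30 * 60        # look for feeding in the 30 min after the run
MIN_FEEDING_S = 6 * 60         # at least 6 min of feeding marks a success


def merge_runs(run_times):
    """Group run timestamps into (first_run, last_run) events."""
    events = []
    for t in sorted(run_times):
        if events and t - events[-1][1] <= RUN_MERGE_WINDOW_S:
            events[-1] = (events[-1][0], t)   # extend the current event
        else:
            events.append((t, t))             # start a new event
    return events


def feeding_seconds(feeding_bouts, start, end):
    """Total feeding time that overlaps the window [start, end]."""
    total = 0
    for bout_start, bout_end in feeding_bouts:
        overlap = min(bout_end, end) - max(bout_start, start)
        if overlap > 0:
            total += overlap
    return total


def is_successful(event, feeding_bouts):
    """Apply the 6-min-in-30-min feeding criterion to one run event."""
    _, last_run = event
    fed = feeding_seconds(feeding_bouts, last_run, last_run + FEED_WINDOW_S)
    return fed >= MIN_FEEDING_S


events = merge_runs([0, 300, 4000])
print([is_successful(e, [(600, 1100)]) for e in events])
```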
List of symbols. i, stride number; P_i, two-dimensional position; Δθ_i, signed
change of heading; ω_i, heading angular velocity; ΔT, sampling interval; a_i,
horizontal acceleration; a_t,i, tangential or forward acceleration; a_c,i, centripetal
acceleration; r_i, instantaneous turn radius; v_i, stride averaged horizontal speed; K,
whole-body power; k_i, mass-specific whole-body power; S_i, parameter to be
weighted; S_i,w, parameter after weighting; μ, coefficient of friction; m, body mass;
g, acceleration due to gravity.
X-ray structure of the mammalian
GIRK2-Gβγ G-protein complex


Matthew R. Whorton & Roderick MacKinnon


G-protein-gated inward rectifier K⁺ (GIRK) channels allow neurotransmitters, through G-protein-coupled receptor
stimulation, to control cellular electrical excitability. In cardiac and neuronal cells this control regulates heart rate and
neural circuit activity, respectively. Here we present the 3.5 Å resolution crystal structure of the mammalian GIRK2
channel in complex with βγ G-protein subunits, the central signalling complex that links G-protein-coupled receptor
stimulation to K⁺ channel activity. Short-range atomic and long-range electrostatic interactions stabilize four βγ
G-protein subunits at the interfaces between four K⁺ channel subunits, inducing a pre-open state of the channel. The
pre-open state exhibits a conformation that is intermediate between the closed conformation and the open
conformation of the constitutively active mutant. The resultant structural picture is compatible with ‘membrane
delimited’ activation of GIRK channels by G proteins and the characteristic burst kinetics of channel gating. The
structures also permit a conceptual understanding of how the signalling lipid phosphatidylinositol-4,5-bisphosphate
(PIP₂) and intracellular Na⁺ ions participate in multi-ligand regulation of GIRK channels.


In 1921, Otto Loewi established the existence of chemical synaptic
transmission by showing that vagus nerve stimulation slows the heart
rate through release of a chemical substance he called vagusstoff.
Vagusstoff was later shown to be acetylcholine, the major neurotransmitter
of the parasympathetic nervous system. Once released from
the vagus nerve, acetylcholine binds to the M2 muscarinic receptor, a
G-protein-coupled receptor (GPCR) in heart cell membranes, and
causes the release of G-protein subunits Gα and Gβγ from the receptor’s
intracellular surface. The Gβγ subunits activate GIRK channels,
causing them to open. Open GIRK channels drive the membrane
voltage towards the resting (Nernst K⁺) potential, which slows the rate
of membrane depolarization, as depicted (Fig. 1a). In atrial pacemaker
cells of the heart, this directly decreases firing frequency and thus heart
rate. Isoforms of the GIRK channel also exist in neurons, which permit
G-protein-mediated regulation of neuronal electrical excitability.
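For readers who want the Nernst K⁺ potential mentioned above in explicit form, the standard expression is given below with one worked number. The concentrations and temperature used are assumed, textbook-style values chosen only for illustration. They are not measurements from this study.

```latex
E_{\mathrm{K}} = \frac{RT}{zF}\,\ln\frac{[\mathrm{K}^+]_{\mathrm{o}}}{[\mathrm{K}^+]_{\mathrm{i}}},
\qquad z = +1 .

% Illustrative values (assumed): T = 310 K, [K+]_o = 4 mM, [K+]_i = 140 mM
\frac{RT}{F} \approx 26.7\ \mathrm{mV},
\qquad
E_{\mathrm{K}} \approx 26.7\ \mathrm{mV} \times \ln\frac{4}{140} \approx -95\ \mathrm{mV}.
```

Under these assumed conditions, opening more K⁺-selective channels pulls the membrane voltage towards roughly −95 mV, which opposes depolarization.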

For several decades, electrophysiological and biochemical methods
have been applied to understand how G-protein subunits activate
GIRK channels. Specific mutations on the Gβγ subunit and on
the channel were shown to alter G-protein-mediated activation of
GIRK channels. Biochemical and NMR studies identified components
of both the G protein and channel that appear to interact with each
other. Together these studies point to a direct interaction between
the G-protein subunits and the channel to achieve channel activation.
Here we present the crystal structure of a GIRK channel bound to Gβγ
subunits, a key signalling complex in the G-protein-mediated control
of electrical excitability.


GIRK2 activation by G-protein subunits 


Our study addresses mouse GIRK2 (Kir3.2; inward rectifier K⁺ channel
(Kir)), a neuronal GIRK channel that is able to function as a tetramer of
identical subunits. Activation of GIRK2, which hereafter we refer to
as GIRK, by GPCR stimulation is shown using an assay in which the
M2 muscarinic GPCR is co-expressed together with GIRK channels in
Xenopus laevis oocytes (Fig. 1b, left). Initial replacement of Na⁺ by
K⁺ in the extracellular solution causes some current to flow into the
oocyte, measured using two-electrode voltage clamp. When acetylcholine
is then applied, a larger inward K⁺ current is turned on. Inhibition of
current by tertiapin-Q, a bee venom toxin derivative that is known to
inhibit the GIRK channel but not endogenous oocyte channels, establishes
the current as mediated by the GIRK channels, a fraction of
which are active in the absence of acetylcholine. The fraction of
current activated by acetylcholine is variable, depending on the oocyte.
Isolated membrane patches show the characteristic gating of single
GIRK channels (Fig. 1b, right). These channels display ‘burst kinetics’,
during which time an activated channel flickers rapidly between conducting
(open) and non-conducting (closed) states, a property we will
consider later. These electrophysiological recordings and other functional
studies were carried out with the same construct used for crystallization
and structural analysis. Hereafter, we refer to this construct,
which consists of residues 52-380, as the wild-type channel. We
emphasize that removal of the disordered amino and carboxy termini
does not appear to alter the functional properties of the channel in any
of the electrophysiological and flux measurements we made.

All studies of G-protein-mediated GIRK channel activation to date
have been carried out with native cells or with cell lines in which
components were expressed heterologously (as in Fig. 1b). Having
obtained individual isolated components (the GIRK channel, Gβγ
subunits and the signalling lipid PIP₂), we tested whether these alone
(that is, in the absence of other cellular components) are sufficient to
produce a competent signalling complex. Using a flux assay in which
the isolated components are reconstituted into synthetic lipid vesicles,
we find that they are indeed sufficient: baseline K⁺ flux observed in
the absence of Gβγ is strongly enhanced in the presence of Gβγ (Fig. 1c).
GIRK is also activated by intracellular Na⁺, which accounts for the
greater flux observed in Na⁺ than in N-methyl-D-glucamine (NMDG⁺)
(refs 26-29). However, even in the presence of Na⁺, Gβγ still causes
significant enhancement of flux. These measurements with purified,
reconstituted components confirm the conclusion reached through
electrophysiological studies, that Gβγ in the presence of membranes
containing PIP₂ is sufficient to increase the open probability of GIRK
Figure 1 | Functional properties of the channel. a, Schematic of GPCR
activation of GIRK channels. Agonist binding to a GPCR promotes the
exchange of GDP for GTP on a bound G protein. This causes the G protein to
dissociate from the receptor. The Gα and Gβγ subunits subsequently dissociate
from each other and they can then interact with effector proteins. Gβγ binding
to the cytoplasmic domain of a GIRK channel in the presence of PIP₂ causes the
channel to open. GIRK channels are also activated by elevated levels of
intracellular Na⁺ ions. b, Example of GPCR-activation of GIRK. The truncated
GIRK construct used for crystallography was co-expressed with the M2
muscarinic receptor in X. laevis oocytes. Whole-cell current was measured
using two-electrode voltage clamp while holding the cell at −60 mV. The white
bars indicate a physiological extracellular solution, whereas the grey bars
represent a solution containing 98 mM KCl. The application of 10 µM
acetylcholine (ACh, an M2R agonist), or 1 µM of tertiapin-Q (TPN-Q, a specific


channels. These experiments do not exclude a possible role for the Gα
subunit in regulating the GIRK channel or in conferring G-protein
specificity, that is, explaining why GIRK normally is activated by
Gβγ subunits associated with ‘inhibitory’ Gαi/o subunits and not by
Gβγ subunits associated with stimulatory Gαs subunits. Although
many questions remain, these reconstitution experiments show that
Gβγ by itself is sufficient to activate the GIRK channel.


Role of membrane in complex formation 


Efforts to purify a stable GIRK-Gβγ protein complex in detergent
solutions were unsuccessful. We therefore attempted to grow crystals
of the complex in dodecylmaltoside by combining individually purified
GIRK and Gβγ proteins at a twofold to threefold molar excess of Gβγ
in the presence of a tenfold molar excess of PIP₂. Crystals containing
both GIRK and Gβγ grew and diffracted to 3.5 Å resolution. These
were of space group I422 with one GIRK monomer and one Gβγ complex
per asymmetric unit. Phases were solved by molecular replacement
using previously determined structures of GIRK and Gβγ as
search models. A model of the complex was built and refined to
working and free residuals, R_work and R_free, of 22.8% and 26.5%, respectively
(Supplementary Table 1). The biological unit consists of one channel
tetramer, four Gβγ subunits, four PIP₂ molecules and four Na⁺ ions
bound to regulatory sites in addition to K⁺ ions in the selectivity filter.
Intracellular Na⁺, PIP₂ and Gβγ are all physiological regulators of
GIRK channel gating.
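As a reminder of what the working and free residuals quoted above measure, the conventional crystallographic R-factor is written out below. This is the standard textbook definition and is given here only for orientation. The exact reflection sets and any scaling used in this refinement are not described in the text.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

The working residual is evaluated over the reflections used in refinement, whereas the free residual is evaluated over a small set of reflections held out from refinement, so a modest gap between the two (here 22.8% versus 26.5%) is the usual check against over-fitting.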

The arrangement of protein molecules within the crystal lattice is
notable in light of a functional phenomenon known as ‘membrane
delimited’ activation of GIRK channels by G-protein stimulation
(Supplementary Fig. 1a, b). Electrophysiological studies showed that
on their way to reaching the channel, Gβγ subunits behave as if to
GIRK2 blocker) is also indicated. The traces under the dashed line represent
negative, inward currents. Single-channel recordings in the on-cell patch-clamp
configuration (right). The patch pipette contained 96 mM KCl and
10 µM ACh. The patch was held at −100 mV. A closeup of one of the burst
openings is shown beneath. c, Activation of purified GIRK channels
reconstituted into lipid vesicles. Channel activity was monitored in the presence
of either NMDG-Cl or NaCl using a fluorescence-based assay described in
detail in the Methods section. Purified Gβγ was added to some of the samples in
either the NMDG-Cl or NaCl buffers, as indicated. The initiation of K⁺ flux by
the addition of the H⁺ ionophore carbonyl cyanide m-chlorophenyl hydrazone
(CCCP) is indicated. The addition of the K⁺ ionophore valinomycin to
measure total flux capacity of the vesicles is also indicated. The dashed lines
represent the same experimental conditions, except that the vesicles do not
contain any GIRK.


diffuse while attached to the membrane’s cytoplasmic surface (that is,
membrane delimited). In the crystal we observe pseudo-membrane
layers consisting of transmembrane channel domains (TMDs) and
aqueous layers consisting of cytoplasmic channel domains (CTCDs)
and Gβγ subunits (Supplementary Fig. 1a, b). The Gβγ subunits are
oriented such that the C terminus of the Gγ subunit, which contains a
covalent lipid molecule, a geranylgeranyl group, is pointed directly at
the membrane layer as if to function as an anchor (Fig. 2a). We note
that a similar arrangement of the G-protein subunits was observed in
the crystal of the β2-adrenergic GPCR in complex with the Gαβγ
heterotrimer, which was determined in lipid cubic phases. Thus,
the GIRK-Gβγ crystal is compatible with physiological membrane
delimited Gβγ activation of GIRK. We also note that our ability to
achieve a complex in a crystal with membrane-like layers, but not in
detergent solutions, implies that a membrane is important in the
formation of the complex between GIRK and Gβγ.


The protein complex 
Two views of the GIRK channel show the gating regulators Na⁺, PIP₂
and Gβγ subunits bound (Fig. 2a, b). On the extracellular side of the
membrane, the channel’s turrets, which surround the pore entryway,
project approximately 10 Å beyond the membrane surface. Previously
we speculated that GIRK channels are more susceptible to pore-blocking
toxins than are some other Kir channels, because the turrets are more
widely spaced in GIRK and thus allow toxins to access the pore. The
structure of the turrets in this crystal is better defined than in the
previous GIRK structures, and indeed supports this hypothesis
(Supplementary Fig. 2).

On the intracellular side, large CTCDs project beyond the membrane
surface, approximately 50 Å into the cell (Fig. 2a). These domains
Figure 2 | Overall structure of the GIRK-Gβγ complex. a, A side view of the
GIRK (blue), Gβ (red) and Gγ (green) complex. The front Gβγ dimer was
removed for clarity. The approximate extent of the phospholipid bilayer is
shown by the black lines. The ‘gg’ label indicates the geranylgeranyl lipid
modification at the C terminus of Gγ. Bound Na⁺ ions are shown as purple
spheres, the PIP₂ molecules are shown as sticks and the K⁺ ions as green
spheres. b, Top-down view of the complex from the extracellular side of the cell.


provide an extensive surface through which molecules inside the cell
can bind to regulate channel gating. The Gβγ subunits interact directly
with the CTCDs through Gβ and with the membrane through the covalent
lipid attached to Gγ. Although we do not see the lipids in the crystal,
the inferred covalent lipid interaction with the membrane is depicted.
The entire GIRK channel-Gβγ complex forms a 120 Å × 120 Å square
against the intracellular surface of the membrane (Fig. 2b).
The contact surface between GIRK and Gβ is approximately 700 Å²
(Fig. 3a-c). On GIRK the contact surface is formed by secondary
structure elements βK, βL, βM and βN from one channel subunit and
by elements βD and βE from an adjacent channel subunit (Fig. 3d). The
occurrence of the binding site at the interface between two channel
subunits is likely to be important for mechanistic reasons, discussed
below. On Gβ the contact surface is formed by β-sheet elements that
form blades 1 and 7 on one edge of the β-propeller (Fig. 3c, e).

We compared the contact surfaces observed in the crystal structure
with inferences drawn using other biophysical methods. Transferred
cross saturation and chemical shift perturbation NMR experiments
have been used to identify amino acids on the GIRK CTCDs that
interact with Gβγ or change upon its binding. These amino acids,
coloured purple (and orange for L344), fall mainly within or near the
perimeter of the surface on GIRK that contacts Gβγ in the crystal
structure (Supplementary Fig. 3a). Mutational studies also identified
numerous amino acids that affect GIRK activation by G-protein
stimulation. These amino acids are coloured orange on the surface
of GIRK and Gβγ (Supplementary Fig. 3a, b). There are some outliers
that may influence function indirectly or alternatively may disrupt
protein structure, but most lie within or near the GIRK-Gβγ contact
surface. Thus, both NMR and mutagenesis studies lend support to a
biologically relevant signalling complex formed in the crystal structure.


Molecular determinants of Gβγ binding


All members of the Kir channel family share the same general molecular
architecture, but, as far as we know, only the GIRKs (Kir3 proteins) are
directly regulated by G-protein subunits. Many amino acids that
compose the Gβγ binding surface on GIRK are also conserved among
the G-protein-independent Kir channels, but a small set are unique to
GIRK (Supplementary Fig. 4). This unique set includes Gln 248 and
Phe 254 on the βD-βE loop and the sequence Leu-Thr/Ser-Leu (342-344)
on the βL-βM loop (Fig. 3f). Gln 248 forms contacts with Gln 75,
Ser 98 and Trp 99 on the Gβ subunit and mutations at Ser 98 and
Trp 99 diminish Gβγ activation of GIRK (Fig. 3f-h). The Leu-Thr/Ser-Leu
sequence contacts Leu 55 and Lys 78 on Gβ (Fig. 3f, g, i).
Mutations involving these residues also affect Gβγ activation of GIRK.
Thus, we can begin to understand Gβγ recognition in terms of short-range
interactions afforded by a relatively small set of residues on the
surface of the GIRK channel to which the Gβγ subunits bind.

Long-range electrostatic interactions between GIRK and Gβγ also
appear significant (Fig. 4a, b). Several acidic residues on the βL-βM
loop of the GIRK CTCDs complement an electropositive swath on
the binding surface of the Gβ subunit (Fig. 4a, b, d). By comparison, a
G-protein-independent Kir channel contains several lysine amino
acids on its βL-βM loop that render its surface potential less electronegative
(Fig. 4c, e). Thus, electrostatic complementarity probably
plays a role in binding affinity and specificity. In addition, by acting
over longer distances, electrostatic forces are able to guide diffusing
molecules into the formation of an encounter complex, where short-range
interactions are then able to take hold. Such long-range guidance
would seem to make sense here by directing the diffusion of Gβγ
to the K⁺ channel once it is released from an activated GPCR.

Charged lipids on the membrane’s inner leaflet dominate the electric
field close to the membrane where G-protein signalling occurs. With
this in mind, Fig. 4f illustrates another potentially important role for
electrostatic interactions between GIRK and Gβγ. In the β2-adrenergic
GPCR-Gαβγ complex, the Gβγ subunits appear oriented to maximize
positive protein charge contact with negative charges on the membrane
surface. In the GIRK-Gβγ complex, although Gβγ resides at
the same level with respect to the plane of the membrane, it is tilted
roughly 35°. The tilt should reduce favourable electrostatic interactions
between Gβγ and the membrane, but should produce new favourable
interactions with GIRK to compensate. Thus, favourable electrostatic
interactions between GIRK and Gβγ may help to reorient Gβγ with
respect to the membrane.
Figure 3 | The GIRK-Gβγ binding interface. a, Surface representation of the
GIRK-Gβγ complex. The binding site on GIRK is coloured yellow and the
binding site on Gβγ is coloured cyan. The front Gβγ dimer is removed for
clarity. The overall orientation is the same as in Fig. 2a. b, A 90°-rotated view of
a Gβγ dimer from panel a to more clearly show the binding interface. c, The
Gβγ dimer is rotated upwards to orient the central axis of the β-propeller
orthogonal to the page. d, e, A cartoon rendering of the binding interface on
GIRK (d) and Gβ (e). Residues involved in the binding site are coloured yellow


Supplementary Fig. 5 compares the different contact surfaces that
Gβγ uses to interact with other proteins, namely Gα and four other
effector and regulatory proteins, GIRK among them. These comparisons support
three conclusions. First, the β-propeller of Gβ creates a large sticky
surface that enables a multitude of unique interactions. Second, the
GIRK binding site on Gβγ overlaps the Gα binding site. This observation,
although anticipated, underscores the necessity of receptor
activation and G-protein subunit dissociation (into Gα and Gβγ) to
achieve channel activation. Third, the contact surface of RGS9 (regulator
of G-protein signalling 9) on Gβ5 is essentially non-overlapping
with that of GIRK on Gβ1, although a conformational change would
be required in RGS9 to allow it to bind to a GIRK-Gβγ complex.
RGS9 in the nervous system suppresses the activity of opioid- and




on GIRK (d) and cyan on Gβ (e), and respectively correspond to the highlighted
regions in panels a and b. In the d, e and f, g pairs of panels, the binding site can
be approximately recapitulated by rotating each panel 90° towards each other,
like making a sandwich. f, g, The same view as in panels d and e, with the
residues involved in the binding site instead shown as sticks. h, i, A closeup of
the GIRK-Gβγ interaction, focused on the DE loop, βK and βN region (h) or
the LM loop region (i) of GIRK. Selected hydrogen bond and van der Waals
interactions are shown as dashed lines as a visual aid.


dopamine-mediated G-protein signalling. Further studies will be
needed to determine whether these signalling pathways intersect.


Gating control by Gβγ

What does the complex structure tell us about the regulation of GIRK
channel gating by the Gβγ subunits? With the exception of the C-terminal
half of Gγ, which is displaced by a crystal contact, Gβγ is
structurally unchanged whether bound to Gα or to GIRK. The GIRK
structure on the other hand is altered by the presence of Gβγ. Most
notably the CTCD is rotated about the channel axis 4° anticlockwise
(viewed from the membrane) relative to the TMD (Fig. 5a, b and
Supplementary Video 1). The CTCD rotation is associated with an unwrapping
and splaying of the right-handed bundle of four inner helices that
Figure 4 | The role of electrostatics at the GIRK-Gβγ interface. a-c, Surface
representations are shown for Gβγ (a), GIRK2 (b) and IRK2 (ref. 47) (c), and
are coloured according to calculated electrostatic surface potentials (−100 mV,
red; +100 mV, blue). The proteins are shown in the same orientation as in
Fig. 3d-g, except slightly zoomed out. A black outline of a cartoon
representation of GIRK (or IRK2 in panel c) is overlaid to help the viewer match
interacting surfaces. d, e, A closeup view of the LM loop region on GIRK2 (d) or


form the closed ‘inner helix gate’ in the absence of Gβγ. Four Phe 192
side chains on the inner helices come together to form the narrowest
constriction in the closed inner helix gate. These phenylalanine side
chains are partially disordered in the slightly splayed structure with
Figure 5 | Gβγ-induced conformational changes in GIRK. a-e, The structure
of wild-type (WT) GIRK in complex with PIP₂ (Protein Data Bank (PDB):
3SYA) is shown (grey) and is compared to the wild-type GIRK-Gβγ complex
(blue) and the GIRK(R201A) mutant in complex with PIP₂ (PDB: 3SYQ)
(green). All structures are aligned by a structurally inert region around the
selectivity filter at the top of the transmembrane domain to show the relative
twisting of the cytoplasmic domains. Panels a and c show a Cα ribbon trace of a
side view of the transmembrane domain, with Phe 192 shown as sticks for
reference. Panels b and d show a top-down view of the cytoplasmic domain, with
IRK2 (e). The Cα atoms of aspartate or glutamate residues are shown as red balls,
and arginines and lysines are shown as blue balls. f, Cartoon representations of
the GIRK-Gβγ complex and the β2-AR-Gαsβγ complex are shown. The black
lines highlight the difference in the relative orientation of the two Gβγ dimers to
the membrane (grey rectangle) (top). Isocontour representations of the
electrostatic potential for an isolated GIRK channel (left) or a Gβγ dimer (middle
and right) are shown (−25 mV, red; +25 mV, blue) (bottom).


Gβγ subunits bound, however, the pathway seems too narrow to conduct
hydrated K⁺ ions: other clearly open K⁺ channels, such as the
Kv1.2 voltage-dependent K⁺ channel, have a minimum diameter of
10 Å (distance between van der Waals surfaces), but in this structure of
the degree and direction of twisting indicated. Panel e is a 70° rotated view of
panel c to highlight the conformational changes of the inner helices of the two
subunits that bound PIP₂ in this structure. f, A Cα ribbon trace of part of the
cytoplasmic domain from the wild-type GIRK plus Gβγ structure. Red colouring
reflects the root mean squared deviation (r.m.s.d.) between the wild-type GIRK
and wild-type GIRK plus Gβγ structures when they are aligned by their
cytoplasmic domains. This highlights the additional conformational changes
that happen in the cytoplasmic domain apart from the rigid-body twisting
shown in panels b and d. The most intense red represents an r.m.s.d. of 0.8 Å.
GIRK2 it is only 6-7 Å. However, the single-channel recordings
suggest perhaps we should not expect an open conformation in the
crystal structure (Fig. 1b). G-protein subunits are most probably bound
to the GIRK channel for the duration of an activity burst; however,
during the burst the channel flickers rapidly with a relatively low open
probability. This might suggest that the GIRK structure we observe in
the presence of Gβγ, which adopts a distinctly different conformation
from the structure without Gβγ, represents a G-protein-activated, pre-open
conformation, corresponding to the channel part way along the
reaction pathway from closed to open.

In a previous study, we determined the crystal structure of a constitutively
open point mutant of the channel, GIRK(R201A), which is conductive
in the absence of G-protein stimulation. The mutant structure is
indeed open, and its comparison to the wild-type channels in the absence
and presence of Gβγ is suggestive of a mechanism (Figs 5c-e
and 6). In the mutant channel, the CTCD is rotated an additional 4°
beyond the rotation caused by Gβγ, and the CTCD subunits have
undergone an internal conformational change associated with widening
of the membrane-facing apex of the CTCD. This widening further
opens the inner helix gate to a diameter of 9 Å. One caveat is that only
two PIP₂ molecules are bound to the tetramer in the mutant channel
(to diagonally opposed subunits), so that opening is twofold rather than
fourfold symmetric. Packing in the mutant crystal appears to have
prevented the binding of PIP₂ molecules to all four subunits, which
is observed in the wild-type structures. We suspect that had four PIP₂
molecules bound to the mutant channel, then opening would be symmetric.
Despite the asymmetry of GIRK(R201A), the conformation of
GIRK in the Gβγ complex is clearly intermediate between the closed
(Gβγ-free) and opened GIRK(R201A) structures. A morph between
these conformations shows that binding of Gβγ causes a 4° rotation of
the CTCDs and a slight splaying of the inner helices. The GIRK(R201A)
mutation produces a further 4° rotation, a conformational change
within the CTCD subunits, and an opening of the inner helix gate
(Supplementary Video 2). These conformations could account for
the burst kinetic behaviour of single GIRK channels if binding of the
Gβγ subunits produces a pre-open conformation in the membrane,
from which the channel flickers rapidly between open (conductive) and
pre-open (nonconductive) conformations (Fig. 6, highlighted pathway).
This hypothesis would predict that the GIRK(R201A) mutant
channels should exhibit a higher open probability. Unfortunately, due
to reduced expression levels, we have been unable to characterize the
single channel behaviour of this mutant in either X. laevis oocytes or
Chinese hamster ovary cells.
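The proposed closed, pre-open and open scheme can be illustrated with a toy discrete-time Markov simulation. This is only a sketch of the hypothesis. The transition probabilities are hypothetical values chosen to produce bursts of brief openings, and they are not fitted to any recording in this study.

```python
import random

# Hypothetical per-time-step transition probabilities (illustration only).
TRANSITIONS = {
    "closed":   {"pre_open": 0.002},            # G-beta-gamma binds: a burst begins
    "pre_open": {"open": 0.2, "closed": 0.01},  # flicker open, or the burst ends
    "open":     {"pre_open": 0.6},              # openings are brief
}


def simulate(n_steps, seed=0):
    """Return the sequence of states visited, starting from 'closed'."""
    rng = random.Random(seed)
    state = "closed"
    trace = []
    for _ in range(n_steps):
        trace.append(state)
        draw = rng.random()
        cumulative = 0.0
        for target, prob in TRANSITIONS[state].items():
            cumulative += prob
            if draw < cumulative:
                state = target
                break
    return trace


def open_probability_within_bursts(trace):
    """Fraction of burst time (pre-open or open) spent in the open state."""
    in_burst = [s for s in trace if s != "closed"]
    if not in_burst:
        return 0.0
    return in_burst.count("open") / len(in_burst)


trace = simulate(20000, seed=1)
print(open_probability_within_bursts(trace))
```

In this scheme the channel can conduct only after passing through the pre-open state, so openings cluster into bursts separated by long closed periods. Raising the hypothetical pre-open to open probability mimics the higher open probability predicted for a constitutively open mutant.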


Discussion 

GIRK2 channels are regulated by PIP₂, G-protein subunits and intracellular
Na⁺ ions. We show in reconstitution experiments,
using purified components, that these regulators individually activate
the channel partially, and in combination activate it to a greater
extent. We present a crystal structure of GIRK with all three regulators
bound. Together with previously determined crystal structures of
GIRK and an R201A mutant of GIRK, both determined in the presence
and absence of PIP₂, we have pieced together a structural description of
conformational states that might underlie the sequential activation of
GIRK channels (Fig. 6, highlighted pathway). The binding of Gβγ
subunits to GIRK causes a rotation of the CTCDs with respect to the
TMD and a partial splaying of the inner helices. This conformation is
intermediate between the closed and GIRK(R201A) open conformations.
Full opening is associated with a further rotation of the CTCDs
and splaying open of the inner helical gate.

Together the structures permit conceptual explanations for multi-ligand
regulation. PIP₂ is required for full gate opening in the GIRK(R201A)
mutant channel. Thus, PIP₂ seems to play a facilitative role;
under conditions that favour opening, PIP₂ helps, presumably by


Figure 6 | A model of gating regulation of GIRK
channels. The blue shapes depict a GIRK channel
with a selectivity filter (a) and two gates: the inner
helix gate (b) and the G loop gate (c). The structures
correspond to wild-type GIRK without PIP₂ or Gβγ
(PDB: 3SYO), wild-type GIRK with PIP₂ only (PDB:
3SYA), GIRK(R201A) without PIP₂ or Gβγ (PDB:
3SYP), wild-type GIRK with PIP₂ and Gβγ (PDB:
4KFM), and GIRK(R201A) with PIP₂ (PDB: 3SYQ).
Circular arrows with degrees indicate CTCD
rotation about the pore axis with respect to the
TMD, relative to wild-type structures without Gβγ.
Curved arrows in GIRK(R201A) with PIP₂ reflect
the outward rocking of CTCD subunits observed in
this structure. Idealized single-channel recordings
are shown on the right (expanded time scale on the
bottom) to illustrate our current hypothesis
regarding the gating transitions that the channel
undergoes. Inter-burst periods correspond to a
channel with only PIP₂ bound (top); bursts (grey
bars) correspond to the channel with PIP₂ and Gβγ
bound, which fluctuates rapidly (indicated by
dashed arrows) between non-conducting (right) and
conducting (bottom) conformations.
strengthening the interface between the CTCD and TMD where PIP₂
is bound. The Na⁺ ion is bound to the CTCD at a position that
undergoes a conformational change when the channel opens. Thus,
we should expect Na⁺ binding to be thermodynamically coupled to
channel gating, allowing Na⁺ to function as a regulator.
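The coupling argument can be made concrete with the standard linkage relation for a ligand that binds the two gating states with different affinities. The relation below is a generic textbook sketch for a single Na⁺ site, not a model derived or fitted in this work. The symbols L₀, K_C and K_O are introduced here purely for illustration.

```latex
% L_0      : open/closed equilibrium constant with no Na+ bound (assumed symbol)
% K_C, K_O : Na+ dissociation constants of the closed and open states (assumed symbols)
L\bigl([\mathrm{Na}^+]\bigr) \;=\; \frac{[\mathrm{O}]}{[\mathrm{C}]}
  \;=\; L_0 \,\frac{1 + [\mathrm{Na}^+]/K_{\mathrm{O}}}{1 + [\mathrm{Na}^+]/K_{\mathrm{C}}}
```

If the conformational change at the binding site makes Na⁺ bind more tightly to the open state (K_O < K_C), raising intracellular Na⁺ increases L and therefore favours opening, which is the sense in which binding and gating are coupled.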

Concerning stoichiometry, we do not know how many Gβγ subunits
are required to open the GIRK channel, but based on the structure
we are compelled to speculate. Gβγ binding causes a rotation of the
CTCD associated with splaying of the inner helices to open the gate.
The rotation no doubt occurs because Gβγ binds at the interface
between two adjacent CTCD subunits, which produces detectable relative
motions of the subunits, and inferred strain between them (Fig. 5f
and Supplementary Video 3). This, we believe, is the source of the
rotation. One Gβγ subunit causing strain across one of four interfaces
is probably not enough. Four is undoubtedly better. Can a single GPCR
near a GIRK channel catalyse a sufficient number of GDP to GTP
exchange reactions and release Gβγ subunits to activate the channel,
or do several GPCRs surround a GIRK channel, and if the latter, are
GIRK channels and GPCRs randomly distributed or organized in a
stoichiometric cluster or array? The GIRK-Gβγ complex provides a
starting point for addressing these questions.


METHODS SUMMARY 


A truncated GIRK2 construct (containing residues 52-380) was expressed and
purified from Pichia pastoris cells as previously described, except that the detergent
dodecyl-β-D-maltopyranoside (DDM) was used instead of decyl-β-D-maltopyranoside
(DM). Full-length human Gβ1 and Gγ2 were co-expressed in
High Five insect cells, extracted with cholate, purified by Talon metal affinity
chromatography in Anzergent 3-12, and then further purified by ion exchange
and size exclusion chromatography in CHAPS before adding 20% glycerol and
freezing in liquid nitrogen. Aliquots were thawed as needed and the detergent was
exchanged to DDM while bound to a HiTrap ion exchange column. Individually
purified GIRK and Gβγ were combined along with C8-PIP₂ at 200 µM, 400 µM
and 2 mM respectively and allowed to incubate for 2 h at room temperature
before proceeding with crystal trials. Crystals of the complex were grown by
the hanging drop vapour diffusion method at 20 °C using 600 mM NaK tartrate
and 50 mM ADA pH 5.7-5.9 as the crystallization solution. Crystals were cryoprotected
by briefly transferring to solutions containing 2.35-2.4 M NaK tartrate.
The structure was solved by molecular replacement, using previously determined
structures of both GIRK and Gβγ as search models. The resulting model was
improved by iterative rounds of refinement and manual adjustment. Electrophysiology
recordings of GIRK channel function were performed by expressing
GIRK in X. laevis oocytes and using two-electrode voltage clamp to measure
whole cell currents, or on-cell patch clamp to measure single-channel activity.
In vitro assessment of GIRK function was performed by reconstituting purified
GIRK into lipid vesicles and then measuring GIRK activity under a variety of
conditions using a fluorescence-based assay.


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS


Molecular Biology. A truncated GIRK2 construct (containing residues 52-380) 
was cloned into the pPICZ, or pGEM vectors for high-level expression or
electrophysiology, respectively, as previously described’. The full-length human
G-protein β1 and γ2 subunits were cloned into pFastbac vectors. The γ2 construct
also included an N-terminal His10 tag, followed by a yellow fluorescent protein 
(YFP), and then a PreScission protease site (LEVLFQ/GP). Individual baculo- 
viruses were made from these pFastbac vectors using the Bac-to-Bac system 
(Invitrogen). 

Protein expression and purification. GIRK2 was expressed in P. pastoris 
as previously described*'. GIRK2 was extracted and purified from P. pastoris 
cells essentially as previously described*', with a few exceptions: dodecyl-β-D-
maltopyranoside (DDM) was used in all steps instead of decyl-β-D-maltopyranoside
(DM). DDM was used at 4% for extraction, 0.4% during the Talon purification, and 
0.05% on the Superdex-200 column. 10 mM imidazole was included during the 
batch binding to the Talon resin. PreScission protease-cleaved protein was loaded 
onto the Superdex-200 column at a sufficiently high concentration such that the 
1 ml peak was at least 1 mg ml⁻¹. This was done to reduce the final detergent
concentration in the concentrated protein, which was necessary for growing large,
thick crystals. The protein was concentrated to 30-40 mg ml⁻¹ in a 50K MWCO
concentrator, and was typically used immediately.

The G proteins were expressed in High Five insect cells (Invitrogen). High Five 
cells were grown at 27 °C in Express Five serum-free media (Invitrogen), supple- 
mented with L-glutamine. The cells were grown to a density of 1-2 million cells 
per ml and then infected with a volume of baculovirus for each protein empirically 
determined to give optimal expression (~30 ml). After 48 h, the cells were harvested 
by centrifugation at 4,000g for 15 min. The cell pellets were resuspended in a small 
volume of the supernatant and this slurry was transferred to 50 ml conical tubes 
(approximately one litre of cells per 50 ml tube). After another centrifugation at 
4,000g for 15 min, the supernatant was removed and the cell pellets (~15 ml) were
frozen in liquid N2 and then stored at −80 °C until needed.

A typical Gβγ preparation involved purifying protein from 8 litres worth of
cells. All procedures were performed at 4 °C unless indicated. Frozen cell pellets
were added to 480 ml of room temperature buffer comprised of 50 mM HEPES,
pH 8; 65 mM NaCl; 1 mM EDTA; 5 mM β-mercaptoethanol (BME) and protease
inhibitors (0.1 mg ml⁻¹ pepstatin, 1 mg ml⁻¹ leupeptin, 1 mg ml⁻¹ aprotinin,
0.1 mg ml⁻¹ soy trypsin inhibitor, 1 mM benzamidine and 1 mM
phenylmethylsulfonyl fluoride). This solution was stirred at room temperature in a stainless
steel beaker until the pellets melted. Then the beaker was transferred to ice and the
solution was sonicated six times for 1 min each using a probe sonicator (Branson),
with a 1 min cool-down in between. The lysed cells were then spun at 35,000g for
35 min to pellet the membranes. The supernatant was poured off and the pellets
were each briefly rinsed with ~5 ml of 50 mM HEPES, pH 8; 50 mM NaCl;
0.1 mM MgCl2; 5 mM BME and protease inhibitors. The pellets were then
resuspended in the same buffer using a dounce homogenizer to a final volume of
350 ml. Na-cholate was added to a final concentration of 1.5%, and the solution 
was stirred for 40 min. The solubilized membranes were spun again at 35,000 g for 
35 min to pellet insoluble material. The supernatant was diluted with two volumes 
of dilution buffer (20 mM HEPES, pH 8; 300mM NaCl; 5mM BME; 7.5mM 
imidazole; 0.5% anzergent 3-12 (Anatrace) and protease inhibitors) and then 
added to 32 ml of Talon resin (Clontech) pre-equilibrated in dilution buffer. 
This suspension was stirred for 1h, then spun at 1,000g for 5 min. The resin 
was transferred to a column and washed with 4 column volumes (cv.) of 20 mM 
HEPES, pH 8; 300 mM NaCl; 5mM BME; 5 mM imidazole and 0.5% anzergent 
3-12; then, 4 cv. of 20mM HEPES, pH 8; 50mM NaCl; 5mM BME; 10mM 
imidazole and 0.5% anzergent 3-12. Then the protein was eluted from the column 
with 20 mM HEPES, pH 8; 40mM NaCl; 5mM BME; 200 mM imidazole and 
0.5% anzergent 3-12. Dithiothreitol (DTT) and EDTA were added to 5 mM and 
1 mM, respectively. PreScission protease was also added at 1:20 protease:total 
protein, and incubated overnight. The next day, an additional amount of Pre- 
Scission (1:40 protease:total protein) was added and incubated at room temperature 
for 2 h. This solution was then diluted down to a conductivity of ~5 mS cm⁻¹ with
20 mM HEPES, pH 8; 5 mM BME; 1% anzergent 3-12. A white precipitate usually
formed at this step, which was mostly comprised of contaminants. This was pelleted
by a brief centrifugation, and the supernatant was further filtered through a 0.22-µm
filter before loading onto a Mono Q 16/10 column, equilibrated with buffer A
(20 mM HEPES, pH 8; 40 mM NaCl; 5 mM BME and 0.7% 3-[(3-cholamidopropyl)-
dimethylammonio]-1-propane sulphonate/N,N-dimethyl-3-sulpho-N-[3-[[(3α,5β,
7α,12α)-3,7,12-trihydroxy-24-oxocholan-24-yl]amino]propyl]-1-propanaminium
hydroxide (CHAPS; Anatrace)). The column was washed with 15 cv. buffer A, then
the protein was eluted with a 50 cv. gradient from 0-20% buffer B (buffer A with
1 M NaCl). The Gβγ protein eluted as a major peak as well as several minor peaks
that were assumed to be unprenylated or differentially phosphorylated species.


The major peak was collected and concentrated in a 30K MWCO concentrator to
at least 5 mg ml⁻¹. This was then run on a Superdex-200 column equilibrated in
20 mM Tris, pH 7.5; 100 mM KCl; 5 mM DTT; 0.7% CHAPS, in multiple runs of
~2.5 mg protein per run. This helped to remove trace smaller molecular weight
Gβγ protein, which was assumed to be an unprenylated species. Peak fractions
were again concentrated in a 30K MWCO concentrator to ~10 mg ml⁻¹. Glycerol
was added to 20% and the protein was frozen in liquid N2 in 150 µl aliquots and
stored at −80 °C until needed. Approximately 8-10 mg worth of these aliquots
were thawed as needed and the detergent was then exchanged to DDM while 
bound to a 1 ml Q Sepharose column (HiTrap, GE Healthcare) at room temper- 
ature. This was done to ensure complete detergent exchange as well as to get a high 
protein concentration without a high detergent concentration, which was neces- 
sary for growing large, thick crystals. DDM (anagrade) was added to 1% final from 
a 10% stock. This was then slowly diluted with two volumes of 20 mM Tris, pH 7.5; 
5mM DTT and 0.05% DDM, and then loaded onto the HiTrap column equili- 
brated with buffer C (20 mM Tris, pH 7.5; 30 mM KCl; 5mM DTT and 0.05% 
DDM). The column was washed with 25 ml buffer C, then 3 ml 10% buffer D 
(buffer C with 300 mM KCl), 3 ml 20% buffer D, 9 ml 30% buffer D, then the 
protein was eluted with 100% buffer D. The middle of this elution peak yielded 
about 0.75 ml of 8-10 mg ml⁻¹. This was then further concentrated in a 30K
MWCO concentrator. After one centrifugation spin of the concentrator, the
protein was diluted with the appropriate volume of 20 mM Tris, 5 mM DTT,
including an appropriate concentration of EDTA to bring the KCl concentration
down to 150 mM and the EDTA concentration up to 1 mM. The protein was then
further concentrated to 40-50 mg ml⁻¹ and stored on ice until needed.
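
The dilution step above can be checked with simple mass-balance arithmetic. The following is a minimal sketch, assuming that the protein leaves the HiTrap column in 100% buffer D (300 mM KCl), that the diluent contains no KCl, and that the sample contains no EDTA before dilution. The function name and the 100 µl sample volume are hypothetical and are not taken from the protocol.

```python
def diluent_for_target(v_sample_ul, c_salt_mM, c_salt_target_mM, c_additive_target_mM):
    """Return (diluent volume in µl, additive concentration needed in the diluent in mM).

    Assumes the diluent is salt-free and the sample is additive-free.
    """
    if not 0 < c_salt_target_mM < c_salt_mM:
        raise ValueError("target salt concentration must be below the starting one")
    # Salt mass balance: c_salt * v_sample = c_target * v_final
    v_final = v_sample_ul * c_salt_mM / c_salt_target_mM
    v_diluent = v_final - v_sample_ul
    # Additive mass balance: c_diluent * v_diluent = c_target * v_final
    c_additive_in_diluent = c_additive_target_mM * v_final / v_diluent
    return v_diluent, c_additive_in_diluent


# Hypothetical 100 µl sample in 300 mM KCl, brought to 150 mM KCl and 1 mM EDTA
volume_ul, edta_mM = diluent_for_target(100.0, 300.0, 150.0, 1.0)
print(f"add {volume_ul:.0f} µl of diluent containing {edta_mM:.1f} mM EDTA")
```

Under these assumptions the sample is diluted with an equal volume of buffer, and the diluent must carry twice the final EDTA concentration.
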
Crystallization. A typical crystallization experiment involved mixing concentrated
GIRK2, Gβγ, and 1,2-dioctanoyl-sn-glycero-3-phospho-(1'-myo-inositol-4',5'-
bisphosphate) (C8-PIP2, Avanti Polar Lipids) at a final concentration of 200 µM,
400 µM, and 2 mM, respectively. The mixture was incubated at room temperature
for at least 2 h before mixing 1:1 (0.2 µl + 0.2 µl) with the crystallization solution
(600 mM NaK tartrate, 50 mM Na-ADA (N-(2-acetamido)iminodiacetic acid),
pH 5.7-5.9). The crystals were grown at 20 °C using the hanging drop vapour
diffusion method. The crystals appeared after a few days and grew as thin square
plates or plate clusters to full size within a week. The crystals were cryoprotected by
first adding 1 µl of a solution containing 20 mM Tris, pH 7.5; 150 mM KCl; 1 mM
EDTA; 0.5% DDM; 1 mM DTT; 720 mM NaK tartrate; 50 mM Na-ADA, pH 5.7
and 1 mM C8-PIP2 directly to the drop. Crystals were gently broken off of the
clusters and then briefly transferred to a new solution containing 20 mM Tris, pH
7.5; 0.5% DDM; 1 mM DTT; 50 mM Na-ADA, pH 5.7; 1 mM C8-PIP2 and 2.35-
2.4 M NaK tartrate, depending on the crystal size. The crystals were then flash-
frozen in liquid N2.

Structure determination. Diffraction data were collected at the Advanced 
Photon Source 23ID-B beamline (λ = 1.033 Å). Diffraction images were
processed with the HKL2000 program suite**. The crystals diffracted anisotropically
(3.45 × 3.45 × 3.8 Å along the a*, b*, and c* axes, respectively), so integrated
diffraction data were truncated to these diffraction limits using a script available
from the UCLA Diffraction Anisotropy Server’. The crystals were highly sensitive 
to radiation damage, so data from three sites on one crystal and one site on another 
crystal were scaled together in HKL2000 to form the most complete data set. Rmerge 
and Ry, diffraction data statistics were calculated using the RMERGE program”. 

The structure was solved with Phaser” by sequentially searching for a GIRK2 
monomer (PDB: 3SYA) and then a Gβ1γ2 dimer (PDB: 1GP2, chains B and G).
Initial rigid-body refinement of the molecular replacement solution with 
REFMAC**® identified the twist between the cytoplasmic and transmembrane 
domains of GIRK2. The model was then further modified in Coot* and refined 
with REFMAC, using jelly-body restraints. The translation/libration/screw (TLS) 
refinement in REFMAC was used in the final rounds of refinement. 

Because of the poor quality of electron density in the K⁺ selectivity filter, distance
restraints were used during refinement between the K⁺ ions in the selectivity filter
to constrain their positions based on known properties from high-resolution K⁺
channel structures. In the final stages of refinement, a strong electron density
feature near the interfacial helix of the channel was modelled as a DDM maltose
headgroup with a five-carbon aliphatic chain. This was based on the bilobal shape of the
density, the location of this density at the presumed boundary of the DDM micelle,
and the presence of = 15 mM DDM in the crystallization condition. PIP2 and Na⁺
ligands were carried over from the original search model and their presence was
confirmed using simulated annealing omit maps (Supplementary Fig. 6a, b).

Comprehensive model validation was performed with MolProbity” (as embed- 
ded within PHENIX”*), with 94.5/5.5% of residues falling within the favoured and 
allowed region of the Ramachandran plot, respectively. Simulated annealing omit 
maps were calculated using PHENIX. Data collection and refinement statistics are 
shown in Supplementary Table 1. All figures were made using PyMOL
(http://www.pymol.org). Videos were made in PyMOL using intermediate structures
interpolated with the CNS*’* script from the Yale Morph Server®. Electrostatics
were calculated using APBS and visualized using the APBS plugin in PyMOL".
Disordered side chains were added back to the model in the most common rotamer 
conformation to make the calculations more accurate. 

Electrophysiology. Two-electrode voltage-clamp recordings of GIRK2 currents 
in X. laevis oocytes were performed as previously described*'. For patch-clamp 
recordings of GIRK2 currents, X. laevis oocytes were injected with complemen- 
tary RNA for the M2 muscarinic receptor and GIRK2 as previously described. 
The patch pipettes typically had a resistance of 4 MΩ and were filled with 96 mM
KCl; 1.8 mM CaCl2; 1 mM MgCl2; 10 mM K-HEPES, pH 7.5 and 1 µM acetylcholine.
The bath solution contained 96 mM KCl, 5 mM EGTA and 10 mM K-HEPES,
pH 7.5. The recordings were made in the on-cell configuration and the patch was
held at −100 mV. The currents were recorded with an Axon Axopatch 200B
(Molecular Devices), filtered at 1 kHz and sampled at 10 kHz using an analogue- 
to-digital converter (Digidata 1440A, Axon Instruments). pClamp10.1 software 
(Axon Instruments) was used for controlling the amplifier and data acquisition. 
Uninjected oocytes showed no detectable currents. 

Reconstitution into lipid vesicles. Purified GIRK2 was reconstituted into lipid 
vesicles by first mixing chloroform solutions of 1-palmitoyl-2-oleoyl-sn-glycero- 
3-phosphoethanolamine (POPE), 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoglycerol 
(POPG), and L-α-phosphatidylinositol-4,5-bisphosphate (PI(4,5)P2; from porcine
brain; predominant acyl chains are 18:0 and 20:4) at mass ratios of 3:1:0.04, and
then drying this solution under an argon gas stream. The dried lipid film was
placed in a vacuum desiccator for a few hours and then resuspended in 20 mM
K-HEPES, pH 7.35; 150 mM KCl; 1 mM EDTA; 35 mM CHAPS (at 10 mg ml⁻¹)
by incubating for 2 h at room temperature with periodic sonication. Purified
GIRK2 was concentrated to ~2 mg ml⁻¹ and then added to 100 µl of the solubilized
lipids, typically at a 1:300 protein:lipid mass ratio, and incubated at room
temperature for 30 min. Dehydrated 1 ml Sephadex G-50 (Sigma) columns were
prepared by loading 1 ml of hydrated Sephadex G-50 resin (equilibrated in 20 mM
K-HEPES, pH 7.35; 150 mM KCl; 1 mM EDTA) onto a small plastic spin column,
and then briefly centrifuging at 1,500g. The protein-lipid mixture was then gently 
pipetted onto the resin bed and the columns were briefly spun up to 1,000g to 
remove the detergent and form the proteoliposomes. 

Flux assay. 10 µl of vesicles were added to 190 µl of flux buffer (20 mM K-HEPES,
pH 7.35; 150 mM NaCl; 1 mM EDTA and 2 µM of the pH-sensitive dye 9-amino-
6-chloro-2-methoxyacridine (ACMA)) in a 96-well black plate (in some experiments,
the NaCl was replaced with N-methyl-D-glucamine HCl (NMDG-Cl)).
This creates a K⁺ gradient across the vesicle membrane and a negative potential
inside the vesicle if there are open K⁺ channels. Fluorescence was monitored
every 5 s at 410/490 nm (excitation/emission, 20 nm bandwidth). After a baseline
was established, the H⁺ ionophore carbonyl cyanide m-chlorophenyl hydrazone
(CCCP) was added to 1 µM from a 400 µM stock and briefly mixed. This allows
protons to enter the vesicles, drawn in by the negative potential, which causes
quenching of ACMA. After 10 min, the K⁺ ionophore valinomycin was added to
20 nM from an 8 µM stock and briefly mixed. This acts as a shunt to indicate the
total capacity of the vesicles. All fluorescence time courses were normalized to the
fluorescence before CCCP addition to account for slight differences in
fluorescence between the wells. In some experiments, a concentrated stock of
CHAPS-solubilized purified Gβγ was added to the vesicles before diluting them
in the flux buffer to give a final concentration of 180 nM.
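
The normalization and the ionophore additions described above can be sketched in a few lines. This is an illustration only: it assumes that "the fluorescence before CCCP addition" is taken as the mean of all baseline readings, and it ignores the small volume change caused by each addition. The function names and the example trace are hypothetical.

```python
def normalize_trace(times_s, fluorescence, t_cccp_s):
    """Divide a fluorescence trace by the mean of the readings taken before CCCP addition."""
    baseline = [f for t, f in zip(times_s, fluorescence) if t < t_cccp_s]
    if not baseline:
        raise ValueError("no readings before CCCP addition")
    f0 = sum(baseline) / len(baseline)
    return [f / f0 for f in fluorescence]


def stock_volume_ul(final_conc_uM, stock_conc_uM, well_volume_ul):
    """Approximate stock volume to add, neglecting the volume of the addition itself."""
    return final_conc_uM * well_volume_ul / stock_conc_uM


# Hypothetical trace sampled every 5 s, with CCCP added at t = 15 s
times = [0, 5, 10, 15, 20]
raw = [1000.0, 1000.0, 1000.0, 800.0, 600.0]
print(normalize_trace(times, raw, t_cccp_s=15))

# Additions to a 200 µl well (10 µl vesicles + 190 µl flux buffer)
print(stock_volume_ul(1.0, 400.0, 200.0))   # CCCP: 1 µM from a 400 µM stock
print(stock_volume_ul(0.02, 8.0, 200.0))    # valinomycin: 20 nM from an 8 µM stock
```

Normalizing each well to its own baseline makes the quenching time courses comparable between wells even when the absolute ACMA signal differs slightly.
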
Sodium content as a predictor of the advanced
evolution of globular cluster stars


Simon W. Campbell, Valentina D'Orazi, David Yong, Thomas N. Constantino, John C. Lattanzio, Richard J. Stancliffe,
George C. Angelou, Elizabeth C. Wylie-de Boer & Frank Grundahl


The asymptotic giant branch (AGB) phase is the final stage of nuclear 
burning for low-mass stars. Although Milky Way globular clusters 
are now known to harbour (at least) two generations of stars’’, they 
still provide relatively homogeneous samples of stars that are used to 
constrain stellar evolution theory». It is predicted by stellar models 
that the majority of cluster stars with masses around the current turn- 
off mass (that is, the mass of the stars that are currently leaving the 
main sequence phase) will evolve through the AGB phase®’. Here we 
report that all of the second-generation stars in the globular cluster 
NGC 6752—70 per cent of the cluster population—fail to reach the 
AGB phase. Through spectroscopic abundance measurements, we 
found that every AGB star in our sample has a low sodium abund- 
ance, indicating that they are exclusively first-generation stars. This 
implies that many clusters cannot reliably be used for star counts to 
test stellar evolution timescales if the AGB population is included. 
We have no clear explanation for this observation. 

We obtained high-resolution spectra (R ~ 24,000) for a sample of 20
AGB stars and 24 red giant branch stars in the Galactic globular cluster
NGC 6752. The spectral coverage included the strong Na I doublet at
5,680 Å. In Fig. 1 we show the stellar sample. We include red giant
branch stars as a control group, because it has previously been shown 
that this evolutionary population harbours the standard abundance 
distributions, including the well-known Na-O anticorrelation present 
in all globular clusters’. 

Our sodium abundance results are shown in Fig. 2. The red giant
branch sample shows the usual spread in
$[\mathrm{Na/Fe}] = \log_{10}(N_{\mathrm{Na}}/N_{\mathrm{Fe}})_{\mathrm{star}} - \log_{10}(N_{\mathrm{Na}}/N_{\mathrm{Fe}})_{\mathrm{Sun}}$
of roughly 1 dex ($N_X$ is the number density of
atoms of each elemental species X). On the other hand, the AGB result
is striking—all the AGB stars in our sample lie at the low end of the red 
giant branch distribution. The upper envelope of the AGB sodium 
abundances is located at about [Na/Fe] = 0.18 dex. This corresponds 
very closely to a previous red giant branch study that defines the Na- 
poor, first-generation population as having [Na/Fe] < 0.2 dex (their 
‘Primordial’ population)’. We find the proportion of Na-poor to Na- 
rich red giant branch stars in our data to be 30:70. This also corre- 
sponds well to the roughly 30:70 proportions found previously’. Thus 
all of our AGB stars appear to be first-generation stars, giving a first 
generation to second generation ratio change from 30:70 in the red 
giant branch population to 100:0 in the AGB population. The range in 
[Na/Fe] in our AGB sample is very small, with a mean of −0.07 dex and
a standard deviation of 0.10 dex. This scatter is comparable to our 
internal uncertainties (Fig. 2 and Supplementary Table 1), which sug- 
gests that the AGB stars may have a uniform abundance of Na. 
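
The abundance notation and the Na-poor/Na-rich split used above can be illustrated with a short sketch. It assumes the 0.18 dex upper envelope of the AGB values as the dividing line. The function names and the sample values are hypothetical and are not the measured abundances.

```python
import math
import statistics

NA_POOR_CUTOFF = 0.18  # dex; upper envelope of the AGB [Na/Fe] values


def abundance_ratio(n_na_star, n_fe_star, n_na_sun, n_fe_sun):
    """[Na/Fe]: logarithmic Na/Fe number-density ratio relative to the Sun, in dex."""
    return math.log10(n_na_star / n_fe_star) - math.log10(n_na_sun / n_fe_sun)


def generation(na_fe, cutoff=NA_POOR_CUTOFF):
    """Label a star as first (Na-poor) or second (Na-rich) generation."""
    return "first" if na_fe <= cutoff else "second"


# Hypothetical [Na/Fe] values for a handful of stars
sample = [-0.15, -0.05, 0.02, 0.10, 0.35, 0.48, 0.60]
labels = [generation(x) for x in sample]
print(labels.count("first"), "first-generation,", labels.count("second"), "second-generation")
print(f"mean = {statistics.mean(sample):.2f} dex, s.d. = {statistics.stdev(sample):.2f} dex")
```

A star with the solar Na/Fe ratio has [Na/Fe] = 0, and a tenfold enhancement of Na relative to Fe corresponds to +1 dex.
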

Our results indicate that the entire population of second-generation 
stars, having increased levels of Na, must fail to enter the AGB phase. 
This is a very significant effect, because the second-generation popu- 
lation contains the majority of the stars in NGC 6752 (70%). Although 
two studies have theorized that some AGB stars may not ascend the
AGB in NGC 6752 (refs 9, 10), this has been based on low-resolution
cyanogen band strength observations. It is known that these observa- 
tions are affected by many uncertainties'’, including the in situ vari- 
ation of C and N along the red giant branch’*. Measurement of 
elemental Na is more robust, because Na is not affected by molecular 
band formation uncertainties and stars of these masses (~0.8 solar 
masses) cannot alter their Na abundances in situ. In particular, a 
reduction in surface Na would require very high temperatures at the 
base of the convective envelope. This is only achieved in much more 
massive stars (≳4 solar masses), via ‘hot bottom burning’ nucleo-
synthesis'*. Thus the result presented here is to our knowledge the
first conclusive confirmation that stars of certain chemical composi- 
tion do not ascend the AGB. Moreover, we can readily identify which 
stars avoid the AGB on the basis of their Na content. 

An obvious consequence of such a large proportion of stars avoiding 
the AGB is that there should currently be many fewer stars in the AGB 
phase than expected. A detailed study reporting star counts of globular 
cluster populations finds a value of $R_2 = N_{\mathrm{AGB}}/N_{\mathrm{HB}} \sim 0.06$ for NGC
6752 (ref. 14; $N_{\mathrm{AGB}}$ and $N_{\mathrm{HB}}$ are the number of AGB and horizontal
branch stars, respectively). This is one of the lowest R2 values in their
globular cluster sample. The globular clusters with the two highest R2
values in their sample (M 5 and M 55) could be assumed to provide an
upper limit to R2, because R2 is fairly insensitive to metallicity, He
abundance and globular cluster age’*. This upper value is roughly
0.18, that is, a factor of 3 higher than that of NGC 6752. This is indeed
Figure 1 | Sample selection in the Strömgren uvby colour-magnitude
diagram of NGC 6752. Small black dots show the whole photometric sample’.
Our AGB and red giant branch (RGB) stellar samples are shown as blue squares
and red triangles respectively. Part of the horizontal branch can be seen at
bottom left, at y magnitudes ≳13.5.
Figure 2 | Sodium abundance results for NGC 6752. [Na/Fe] for our sample
of red giant branch stars (red triangles, 24 stars) and AGB stars (blue squares, 20
stars) shown against stellar effective temperature Teff (also see Supplementary
Table 1). For comparison, the red giant branch results of a previous study
(C07)* are included (grey open circles). The horizontal dotted line at
[Na/Fe] = 0.18 marks the upper envelope of AGB values, and divides Na-rich
and Na-poor stars. The spectroscopic observations were carried out with the
Very Large Telescope. The FLAMES/Giraffe HR11 grating was used, with a
spectral coverage λ = 5,597-5,840 Å. Na abundances (assuming local
thermodynamic equilibrium) were obtained from the strong Na I doublet at
5,680 Å, with the driver abfind in MOOG” (2011 version, available at
http://www.as.utexas.edu/~chris/moog.html) and the Kurucz model
atmospheres with no overshooting”’. Stellar parameters were derived in the
following way: Teff values were calculated from a Strömgren colour (b − y)
calibration”; gravities were then computed from stellar luminosities and the
derived temperatures (with assumed stellar mass M of 0.8 solar masses, and
distance modulus of (m − M)V = 13.30), while microturbulence values ξ were
computed using a relation from the literature”. A metallicity of [Fe/H] = −1.54
dex (ref. 8) was adopted for all stars. Although the lines under scrutiny are known
to be not largely affected by departures from local thermodynamic equilibrium at
these metallicities and temperatures (at most 0.15 dex), we applied non-local
thermodynamic equilibrium corrections” to our Na abundances. Error bars show
the random (internal) uncertainties (see also Supplementary Table 1). The
uncertainties were estimated by adding, in quadrature, errors due to the equivalent
width measurements and those related to stellar parameters. The latter were
evaluated in the standard way, by varying one parameter at a time and inspecting
the corresponding variation in the resulting abundances. We adopted errors of
ΔTeff = ±30 K, Δlog g = 0.1, Δξ = 0.1 km s⁻¹, and Δ[Fe/H] = ±0.05 dex.
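
Two steps in this caption lend themselves to a short worked sketch: computing a surface gravity from luminosity, temperature and an assumed mass, and adding independent error terms in quadrature. The relation used is the standard scaling g ∝ M Teff⁴ / L, which follows from g = GM/R² and L = 4πR²σTeff⁴. The solar reference values (log g = 4.44 in cgs units, Teff = 5,772 K) and the example inputs are assumptions made here and are not stated in the caption.

```python
import math

LOGG_SUN = 4.44    # assumed solar surface gravity, log10(cm s^-2)
TEFF_SUN = 5772.0  # assumed solar effective temperature, K


def log_g(mass_msun, teff_k, lum_lsun):
    """Surface gravity (log10, cgs) from mass, effective temperature and luminosity."""
    return (LOGG_SUN
            + math.log10(mass_msun)
            + 4.0 * math.log10(teff_k / TEFF_SUN)
            - math.log10(lum_lsun))


def quadrature(*errors):
    """Combine independent error contributions in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


# Hypothetical giant: 0.8 solar masses, 4,800 K, 100 solar luminosities
print(f"log g = {log_g(0.8, 4800.0, 100.0):.2f}")

# Hypothetical abundance error terms (dex): equivalent width, Teff, log g, microturbulence
print(f"total = {quadrature(0.05, 0.03, 0.01, 0.02):.3f} dex")
```

In practice the luminosity would come from the photometry, the distance modulus and a bolometric correction; that step is omitted from this sketch.
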


consistent with our result of ≳70% of stars not ascending the AGB.
Current stellar model predictions for R2 tend to be lower than 0.18,
being around 0.12-0.15 (refs 15, 16); however, the models are known to
suffer from significant uncertainties’®. We note that the observed R2
value for NGC 6752 is still at least a factor of 2 smaller than the model
predictions.
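
The consistency argument above amounts to simple scaling, sketched below. The sketch assumes that the horizontal branch count is unchanged and that the stars which do ascend have normal AGB lifetimes, so that R2 scales with the fraction of stars reaching the AGB. The function name is hypothetical.

```python
def expected_r2(r2_if_all_ascend, fraction_avoiding_agb):
    """Expected N_AGB/N_HB when a given fraction of stars never reaches the AGB."""
    if not 0.0 <= fraction_avoiding_agb <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return r2_if_all_ascend * (1.0 - fraction_avoiding_agb)


# Observed upper value (0.18) and the model range (0.12-0.15), each with 70% AGB avoidance
for r2_full in (0.18, 0.15, 0.12):
    print(f"R2 = {r2_full:.2f} -> {expected_r2(r2_full, 0.70):.3f}")
```

Starting from the observed upper value of 0.18, a 70% avoidance fraction gives an expected R2 close to the value of about 0.06 reported for NGC 6752.
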

Studies of the stage of evolution directly preceding the AGB, the 
horizontal branch phase, can shed light on the AGB avoidance phe- 
nomenon. One recent investigation into the Na abundances in a sam- 
ple of horizontal branch stars in NGC 6752 showed that the redder end 
of the horizontal branch (NGC 6752 only has a blue horizontal branch) 
contains only Na-poor stars’” (see Supplementary Information section 
2 for more discussion). This implies that it is the bluer (presumably 
Na-rich) horizontal branch stars that must avoid AGB ascent, leaving 
only the redder, Na-poor horizontal branch stars to populate the AGB 
Figure 3 | Theoretical stellar model tracks overlain on the Strömgren
colour-magnitude diagram of NGC 6752. The solid red line (with open 
circles marking 5 Myr time intervals) is a model with an initial mass of 0.8 solar 
masses and a helium content of Y = 0.245. This Y value matches that reported 
for the redder end of the horizontal branch”. This first-generation model does 
indeed spend most of its horizontal branch evolution at the redder end of the 
horizontal branch. The solid black line (with open squares marking 5 Myr time 
intervals) is a model with an initial mass of 0.75 solar masses and an enhanced 
helium content of Y = 0.285. This second-generation model spends its 
horizontal branch evolution in a bluer part of the horizontal branch, but still 
ascends the AGB, contrary to the observational findings of the current study. 
The solid blue line (with open triangles marking 5 Myr time intervals) shows 
the evolution of the Y = 0.285 model with an ad hoc 20-fold increase in mass 
loss rate (Ṁ = dM/dt) initiated once the star settles on the horizontal branch.
This model evolves downwards along the extreme blue end of the horizontal 
branch and fails to ascend the AGB. The arrow indicates the location of the 
Grundahl jump at y = 14.65 (see text for details). The stellar models were 
calculated using the Monash University stellar structure code MONSTAR”. 
The code has been recently updated with low-temperature opacity tables which 
follow variations in C, N and O”*. The usual Reimers mass loss rate?” was used 
for the red giant branch and horizontal branch (with η = 0.48 for the models
with normal mass-loss). Transformations from the theoretical luminosity-Teff
plane to the colour-magnitude diagram have been made”.
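
For orientation, the Reimers prescription and the ad hoc enhancement mentioned in this caption can be sketched as follows. The sketch uses the commonly quoted form Ṁ = 4 × 10⁻¹³ η (L/L☉)(R/R☉)/(M/M☉) solar masses per year. How the 20-fold increase was implemented in the stellar code is not described here, so the simple multiplicative factor, the function name and the example stellar parameters are all assumptions.

```python
def reimers_mass_loss(lum_lsun, radius_rsun, mass_msun, eta=0.48, enhancement=1.0):
    """Reimers-type mass-loss rate in solar masses per year (positive number)."""
    if min(lum_lsun, radius_rsun, mass_msun) <= 0:
        raise ValueError("luminosity, radius and mass must be positive")
    return enhancement * 4.0e-13 * eta * lum_lsun * radius_rsun / mass_msun


# Hypothetical blue horizontal branch star: 40 L_sun, 1.5 R_sun, 0.6 M_sun
normal = reimers_mass_loss(40.0, 1.5, 0.6)
boosted = reimers_mass_loss(40.0, 1.5, 0.6, enhancement=20.0)
print(f"normal:  {normal:.2e} M_sun/yr")
print(f"boosted: {boosted:.2e} M_sun/yr")
```

Because the rate scales with luminosity and radius, the same η gives far lower mass loss on the compact horizontal branch than near the tip of the red giant branch, which is why a large enhancement factor is needed in the test model.
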


of NGC 6752. Combining this information with our ratio of Na-poor 
to Na-rich stars (Fig. 2), we can estimate the horizontal branch colour 
at which stars fail to ascend the AGB. We find an ‘ascension cut-off’
colour that coincides exactly with the colour for the ‘Grundahl jump’.
The Grundahl jump is a discontinuity in horizontal branch morpho- 
logy which is seen in all globular clusters studied to date whose hori- 
zontal branch extends beyond an effective temperature of ~11,500 K 
(refs 18, 19). Thus it appears that all stars bluer than the Grundahl 
jump do not ascend the AGB, at least in NGC 6752. This may represent 
further evidence that there is some fundamental change in the stellar 
atmosphere structure and/or mass-loss physics occurring at the 
Grundahl jump temperature’®. 

We have calculated some stellar models to compare with the pho- 
tometric observations. The model results are shown in Fig. 3. The 
model representative of the first-generation stars populates the redder 
end of the horizontal branch, before the Grundahl jump, as expected. It 
then continues to the AGB. The model representative of the He-rich 
second-generation stars (presumed to correspond to the Na-rich 
population) populates the bluer end of the horizontal branch (after 
the Grundahl jump), and also continues to the AGB. Thus our second- 
generation model cannot account for the lack of ascension of the 
Na-rich blue horizontal branch stars in this part of the colour-
magnitude diagram. We note that an increased mass loss rate during
the red giant branch phase would result in a bluer zero-age horizontal
branch star (see Supplementary Information section 3 for more detail),
so this cannot be a solution, because the colour-magnitude diagram is
clearly populated in this region. We speculate that one solution to this 
problem may be that the horizontal branch stars blueward of the 
Grundahl jump experience enhanced mass loss. To test this, we arti- 
ficially increased the mass loss rate in the second-generation model 
during its horizontal branch phase by an ad hoc factor of 20 (Fig. 3). 
Indeed this model can populate the blue end of the horizontal branch 
and also fail to become an AGB star. We note however that this test is 
purely hypothetical and, although this result may provide a starting 
point, a thorough investigation into the reasons behind the discord- 
ance between theory and observation is sorely needed. There is cur- 
rently no clear explanation for such a high proportion of failed AGB 
stars. 

Finally, because globular clusters are often used to test stellar evolu- 
tion theory, the extremely high AGB failure rate reported here will 
affect any test which uses star counts of AGB stars. This is true of the R 
method used to check the lifetimes of various phases of evolution. In 
particular, the R′, R1 and R2 values**" all involve the number of AGB
stars, so these values will be flawed (including the globular cluster He 
values inferred from them). This is particularly important if the globu- 
lar clusters in question have blue extensions to their horizontal 
branches, because it is the blue horizontal branch stars that appear 
not to ascend the AGB. Star number counts used to ascertain AGB 
lifetimes will also be misleading, unless the proportion of AGB ascen- 
ders is known somehow—for example, via an ascension cut-off in Na 
abundance or horizontal branch colour. 
The spin Hall effect in a quantum gas


M. C. Beeler, R. A. Williams, K. Jiménez-García, L. J. LeBlanc, A. R. Perry & I. B. Spielman


Electronic properties such as current flow are generally independent of 
the electron’s spin angular momentum, an internal degree of freedom 
possessed by quantum particles. The spin Hall effect, first proposed 40 
years ago (ref. 1), is an unusual class of phenomena in which flowing particles 
experience orthogonally directed, spin-dependent forces, analogous 
to the conventional Lorentz force that gives the Hall effect but opposite 
in sign for two spin states. Spin Hall effects have been observed for 
electrons flowing in spin–orbit-coupled materials such as GaAs and 
InGaAs (refs 2, 3) and for laser light traversing dielectric junctions (ref. 4). 
Here we observe the spin Hall effect in a quantum-degenerate Bose gas, 
and use the resulting spin-dependent Lorentz forces to realize a cold-atom 
spin transistor. By engineering a spatially inhomogeneous spin–orbit 
coupling field for our quantum gas, we explicitly introduce and 
measure the requisite spin-dependent Lorentz forces, finding them to 
be in excellent agreement with our calculations. This ‘atomtronic’ 
transistor behaves as a type of velocity-insensitive adiabatic spin 
selector, with potential application in devices such as magnetic (ref. 5) or 
inertial (ref. 6) sensors. In addition, such techniques for creating and 
measuring the spin Hall effect are clear prerequisites for engineering 
topological insulators (refs 7, 8) and detecting their associated quantized spin Hall 
effects in quantum gases. As implemented, our system realizes a 
laser-actuated analogue to the archetypal semiconductor spintronic device, 
the Datta–Das spin transistor (refs 9, 10). 

The spin Hall effect (SHE) is generated by spin-dependent forces 
orthogonal to a particle’s motion, akin to the Lorentz force, that can 
act on electrons (refs 2, 3, 11), photons (ref. 4) or, as here, neutral atoms. Each of these 
has an internal, or ‘spin’, degree of freedom (a generalization of conventional 
quantum mechanical spin) that can be either up or down, creating 
a spin-1/2 (or pseudospin-1/2) system. In materials, microscopic spin–orbit 
coupling (SOC) induces the SHE in one of two primary ways: by 
means of an intrinsic mechanism driven directly by SOC (ref. 12) or by means 
of an extrinsic mechanism that additionally requires scattering from 
impurities (refs 1, 13). The motion of spins in systems with a SHE is strikingly 
similar to the motion of charges in an external magnetic field, but with 
equal and opposite effective Lorentz forces for each of the two spin states. 
Thus, just as the Lorentz force gives rise to the Hall effect for charged 
particles, spin-dependent Lorentz forces (SDLFs) generate SHEs. 


In the Hamiltonian description of quantum mechanics, forces are 
described in terms of associated potentials. For example, a magnetic 
field $\mathbf{B} = \nabla \times \mathbf{A}$ is generated from a vector potential $\mathbf{A}$ that enters into 
the Hamiltonian $\hat{H} = (\hat{\mathbf{p}} - q_0\mathbf{A})^2/2m$ with canonical momentum $\hat{\mathbf{p}}$, 
charge $q_0$ and mass $m$ (‘hats’ on variables indicate quantum mechanical 
operators acting on continuous degrees of freedom). We engineered 
a vector potential that depends on an effective spin degree of 
freedom with opposite sign for the two effective spin states. This can 
create a SDLF and a SHE when the spins move perpendicularly to the 
resulting spin-dependent $\mathbf{B}$. 

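More formally, this vector potential can be expressed as a vector of 
2 × 2 matrices, leading to a relationship between the generalized vector 
potential $q_0\mathbf{A} \rightarrow \check{\mathbf{A}}$ and generalized magnetic field $q_0\mathbf{B} \rightarrow \check{\mathbf{B}}$ (ref. 14; 
‘checks’ on variables indicate quantum mechanical operators acting 
in pseudo-spin space): 


$$\check{\mathbf{B}} = \nabla \times \check{\mathbf{A}} - \frac{i}{\hbar}\,\check{\mathbf{A}} \times \check{\mathbf{A}} \qquad (1)$$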


The Heisenberg equations of motion show that $\check{\mathbf{B}}$ is the generalized 
magnetic field in a spin-dependent Lorentz force law (Methods). The 
first term in equation (1) is analogous to the conventional magnetic 
field, and the second term is non-zero only when the vector components 
of $\check{\mathbf{A}}$ do not all commute, that is, when $\check{\mathbf{A}}$ is non-Abelian. The 
generalized Lorentz force for the two spin states can be equal and 
opposite, for example when $\check{\mathbf{B}} = B_0\check{\sigma}_3\mathbf{e}_z$, where $B_0$ describes the field’s 
magnitude; $\check{\sigma}_1$, $\check{\sigma}_2$ and $\check{\sigma}_3$ are the 2 × 2 Pauli matrices; and $\mathbf{e}_x$, $\mathbf{e}_y$ and 
$\mathbf{e}_z$ are the three Cartesian unit vectors. 

Two different classes of vector potentials (unrelated by gauge transformations) 
lead to this magnetic field, one resulting from each term 
in equation (1). For example, in two-dimensional material systems, 
almost every possible form of linear SOC (combinations of linear 
Dresselhaus (ref. 15) or Rashba (ref. 16) couplings) is equivalent to a spatially uniform 
non-Abelian vector potential with $-i(\check{\mathbf{A}} \times \check{\mathbf{A}})/\hbar \propto \check{\sigma}_3\mathbf{e}_z$ (Methods 
and ref. 17). In contrast, we engineered a spin–orbit-coupled Hamiltonian 
with a spatially dependent, Abelian vector potential that produces 
$\nabla \times \check{\mathbf{A}} \propto \check{\sigma}_3\mathbf{e}_z$. 
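
The distinction between the two classes can be checked numerically. The sketch below is illustrative only: the coefficients `a, b, c, d` and the profile `A_of_y` are hypothetical, and units with ħ = 1 are assumed. It evaluates the commutator term of equation (1) for a uniform potential built from two non-commuting Pauli matrices, and the curl term for a potential of the form A(y)σ₃ along e_x.

```python
import numpy as np

# Pauli matrices acting in pseudospin space.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

HBAR = 1.0  # assumption: units in which hbar = 1


def commutator_field_z(Ax, Ay):
    """z component of -(i/hbar) A x A, that is -(i/hbar) [Ax, Ay]."""
    return -1j / HBAR * (Ax @ Ay - Ay @ Ax)


# Class 1: spatially uniform, non-Abelian potential (hypothetical coefficients).
a, b, c, d = 0.3, -0.7, 0.7, -0.3
Ax = a * s1 + b * s2
Ay = c * s1 + d * s2
B_nonabelian = commutator_field_z(Ax, Ay)
# Because [s1, s2] = 2i s3, the result is (2/hbar)(a d - b c) s3.
expected = 2.0 / HBAR * (a * d - b * c) * s3


# Class 2: Abelian potential A(y) s3 along e_x (hypothetical profile).
def A_of_y(y):
    return np.exp(-y**2)


y, dy = 0.4, 1e-6
dA_dy = (A_of_y(y + dy) - A_of_y(y - dy)) / (2 * dy)
# (curl A)_z = dAy/dx - dAx/dy = -dA/dy; the commutator term vanishes.
B_abelian = -dA_dy * s3
```

In both cases the field is proportional to σ₃, but only the first relies on the non-commuting components of the potential.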


Figure 1 | Schematic of experimental set-up. a, Raman beams with frequencies $\omega$ and $\omega + \delta\omega$ 
propagating along $\mathbf{e}_x$ coupled two states in the $f = 1$ 
ground-state manifold of ⁸⁷Rb. Dynamic control of 
an optical trapping beam propagating along $\mathbf{e}_x$ 
allowed the BEC to be moved along $\mathbf{e}_y$, giving a 
time- and position-dependent Raman coupling. 
The Raman coupling altered the free-particle 
dispersion along $\mathbf{e}_x$, creating double wells (ref. 21) in 
quasimomentum $q$. b, The modified dispersions, 
$E(q)$, shown for the three different y positions (i, ii 
and iii) marked in a. We associate states near the 
minimum of each well with dressed spins, and 
identify the location of the minima with a vector 
potential $\check{\mathbf{A}}$. 
Figure 2 | Spin Hall shear. a, Raman coupling strength versus y position, fitted 
with the Raman lasers’ Gaussian profile. b, The observed shear coefficient, 
$s_{xy}$ (see text), was opposite for each spin and its magnitude followed the derivative 
of a Gaussian function (solid lines). All uncertainties, s.d. of ~5 measurements. 
c–f, Representative two-dimensional spin-momentum distributions observed 
after TOF at different y positions. For the data in b–f, the magnitude of the effect 
was enhanced by elongating the BEC along $\mathbf{e}_y$ (Methods) and so sampling a greater 
range of the vector potential. The measurements in Figs 3 and 4 were made in the 
portion of the laser shaded in grey in a, where $\nabla \times \check{\mathbf{A}}$ is large and nearly uniform. 


The relationship between these two distinct vector potentials is 
unusual. Although the equations of motion describe the same SDLF 
leading to an intrinsic SHE, the associated energy spectra are different 
(for example, in the two-dimensional material systems discussed above, 
$[\check{\mathbf{B}}, \hat{H}] \neq 0$, implying that $\check{\mathbf{B}}$ is time dependent in the Heisenberg picture). 
However, both can give rise to time-reversal-invariant topological 
insulators. The case with a spatially uniform vector potential mirrors 
the typical situation in materials where the intrinsic SOC leads to 
topological band structure (ref. 8). The case with a spatially dependent vector 
potential leads to the simplest conceptual example of a topological 
insulator: two superimposed quantum Hall systems with equal but 
opposite magnetic fields (ref. 18) (a single quantum Hall system is a topological 
insulator, but with broken time-reversal symmetry). Both types of vector 
potential exhibit the quantum SHE (QSHE) leading to topological 
insulators. Those resulting from spatially dependent vector potentials 
Figure 3 | Spin-polarized SHE. a, Spin-dependent forces along $\mathbf{e}_x$ from motion 
along $\mathbf{e}_y$. b, Acquired momentum along $\mathbf{e}_x$ versus final momentum along $\mathbf{e}_y$: blue 
circles denote $|{\uparrow}'\rangle$ and red circles denote $|{\downarrow}'\rangle$. The solid curves are solutions of 
the Heisenberg equations of motion for each spin, fitted to the data with $\Omega$ as the 
only free parameter (Methods). The resulting $\Omega$ is within 15% of our measured 
coupling strength of $2.5(2)E_R$ at $y = -115$ μm, the centre of the spatial region 
sampled by the atoms during the measurements (grey shaded region in Fig. 2a). 
Uncertainties (in plotted momenta and $\Omega$), s.d. of ~5 measurements. 


are a direct extension of the quantum gas SHE demonstrated in this 
work (ref. 19) but are impractical in material systems (Methods and 
Supplementary Information). 

We realized the SHE with ultracold atoms following the proposal of 
ref. 20 by subjecting pseudospin-1/2 ⁸⁷Rb Bose–Einstein condensates 
(BECs) to spin- and space-dependent vector potentials. Two laser 
beams (which we will refer to as ‘Raman lasers’) with wavelength $\lambda$, 
propagating in opposite directions parallel to $\mathbf{e}_x$, coupled the 
$|f = 1; m_F = 0, -1\rangle = |{\uparrow}, {\downarrow}\rangle$ spin states comprising our pseudospin-1/2 system 
(in analogy to the spin-1/2 electron) with strength $\Omega$ (Fig. 1a). The 
wavelength determines the single-photon recoil energy, $E_R = \hbar^2 k_R^2/2m$, 
momentum, $\hbar k_R = 2\pi\hbar/\lambda$, and velocity, $v_R = \hbar k_R/m$, where $m$ is the 
mass of a ⁸⁷Rb atom and $2\pi\hbar$ is Planck’s constant. In this configuration, 
the Hamiltonian describing motion along $\mathbf{e}_x$ includes an effective SOC 
term (ref. 21), altering the dispersion relation as shown in Fig. 1b. Thus 
modified, the dispersion relation of these laser-dressed atoms features 
two degenerate wells, each displaced from zero by an amount 
$A = \hbar k_R[1 - (\hbar\Omega/4E_R)^2]^{1/2}$ for $\hbar\Omega < 4E_R$ (Methods Summary). Particles 
with momenta near these minima can be thought of as dressed spin 
states $|{\uparrow}', {\downarrow}'\rangle$ (which we will colloquially refer to as spin states when no 
ambiguity is possible) in the presence of a vector potential $\check{\mathbf{A}} = A\check{\sigma}_3\mathbf{e}_x$. 
Given that $\Omega$ depends on the intensity of the Raman lasers, $A$ has the 
spatial dependence of the Raman lasers’ Gaussian intensity profile. 
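
To give a feel for the scales involved, the following sketch evaluates the recoil quantities for the 790.13-nm Raman wavelength quoted later in the text, together with the well displacement A as a function of coupling strength. The physical constants are standard values; the choice to return A = 0 once ħΩ reaches 4E_R (where the two wells of this dispersion merge at q = 0) is an assumption added here for completeness.

```python
import math

H = 6.62607015e-34           # Planck constant, J s
HBAR = H / (2 * math.pi)
AMU = 1.66053906660e-27      # atomic mass unit, kg
M_RB87 = 86.909 * AMU        # approximate mass of a 87Rb atom, kg
WAVELENGTH = 790.13e-9       # Raman laser wavelength from the text, m

k_R = 2 * math.pi / WAVELENGTH             # recoil wavevector, 1/m
E_R = (HBAR * k_R) ** 2 / (2 * M_RB87)     # recoil energy, J
v_R = HBAR * k_R / M_RB87                  # recoil velocity, m/s


def vector_potential(omega_in_ER):
    """|A| in units of hbar*k_R for a coupling hbar*Omega given in units of E_R."""
    ratio = omega_in_ER / 4.0
    # Assumption: beyond hbar*Omega = 4 E_R the wells have merged, so A = 0.
    return math.sqrt(1.0 - ratio**2) if ratio < 1.0 else 0.0


print(f"E_R/h = {E_R / H:.0f} Hz, v_R = {v_R * 1e3:.2f} mm/s")
print(f"A(2.3 E_R) = {vector_potential(2.3):.3f} hbar k_R")
```

With these numbers the recoil energy corresponds to a few kilohertz and the recoil velocity to a few millimetres per second, which sets the momentum scale of the measurements that follow.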

The spatial dependence of $A$ gives rise to a SHE in our quantum 
gas. To probe the mechanism underlying the SHE, we abruptly 
changed $A$ and observed spin-dependent shearing of the atomic cloud 
(Fig. 2). We then observed, for a time-independent $A$, the resulting 
SHE using two techniques: we propelled atoms in either state $|{\uparrow}'\rangle$ or 
state $|{\downarrow}'\rangle$ along $\mathbf{e}_y$ and detected a spin-dependent Lorentz-like response 
along $\pm\mathbf{e}_x$ (Fig. 3); and, using a mixture of both dressed spin states, we 
used the SDLF to realize a spin transistor (Fig. 4). 
Figure 4 | Spin Hall currents. a, Calculated spin current versus potential 
gradient, $V'$, and coupling strength, $\Omega$. The two cuts (black lines) show the 
parameters at which measurements of $\langle j_{s,x}\rangle$ were made. b, Spin current, $\langle j_{s,x}\rangle$, 
versus $V'$. We note that $\hbar\Omega(y = -115\ \mu\mathrm{m}) = 2.3(2)E_R$. The central solid curve 
is a fit of our model to the data, with fitted value $\hbar\Omega = 2.05E_R$. The remaining 
curves are the modelled response at $\hbar\Omega = 1E_R$ (lower magnitude) and $2.5E_R$ 
(higher magnitude). Inset: FET drain-source current, $I_{DS}$, versus drain-source 
voltage, $V_{DS}$, for three different gate-source voltages, $V_{GS}$ = 1.5, 2.4 and 3.5 V 
above threshold. c, Dependence of spin current on coupling strength at a fixed 
trap displacement from $y_0 = -122$ μm to $y_f = -95$ μm with V’ ~ 27//hay. 
The solid curve is the fit of our model to the data (Methods). The scatter in the 
data is reflective of typical uncertainties. 


These experiments began with BECs of 5 X 10* atoms prepared 
in $|{\uparrow}\rangle$, $|{\downarrow}\rangle$ or mixtures thereof, confined in a crossed-beam optical 
dipole trap with typical axial frequencies $\omega_x/2\pi \approx 35$ Hz, $\omega_y/2\pi 
\approx 35$ Hz and $\omega_z/2\pi \approx 100$ Hz. The $\lambda = 790.13$-nm Raman laser 
beams, travelling along $\pm\mathbf{e}_x$, had 170-μm waists ($1/e^2$ radius, where $e$ 
is Euler’s number). We moved the BECs along $\mathbf{e}_y$, sampling this 
inhomogeneous Raman laser profile, by displacing the appropriate 
trap beam. At any given initial y position, $y_0$, we then adiabatically 
turned on the Raman lasers in 150 ms, Raman-dressing the BEC (ref. 21) 
and transforming our initial spin states into their dressed counterparts, 
at rest (Methods). 

We explored the spin and space dependence of the vector potential 
$\check{\mathbf{A}}(y)$ by observing the response of the BECs to abrupt temporal changes 
in $\check{\mathbf{A}}$. When $\check{\mathbf{A}}$ depended on y, these changes sheared the BECs’ density 
distribution. We prepared spin-polarized BECs at a variable position, $y_0$ 
(Methods). Each Raman-dressed BEC therefore sampled a range of 
Raman coupling strengths across its 40-μm diameter (Fig. 2a). When 
the Raman lasers were abruptly turned off, the initially motionless BEC 
experienced a spin-dependent ‘electric’ force, $-\partial\check{\mathbf{A}}/\partial t$, resulting from 
a time-changing vector potential along $\mathbf{e}_x$ (ref. 28). We probed this 
system by switching off the dipole trap and the Raman beams in less 
than 1 μs and absorption-imaging the atoms after a 30-ms time of flight 
(TOF; its duration was common to all of our measurements). 

Because $\check{\mathbf{A}}(y)$ depended on both spin and y position, we observed a 
spin- and $y_0$-dependent shear in the density distribution (Fig. 2b) 
after TOF, described by $n(x, y, z) \propto 1 - (x/R_x)^2 - (y/R_y)^2 - (z/R_z)^2 - 
s_{xy}xy/R_xR_y$, where $R_x$, $R_y$ and $R_z$ are the BEC’s Thomas–Fermi radii. 
The spatial dependence of $\check{\mathbf{A}}$ is quantified by the shear coefficient, $s_{xy}$, 
which is obtained by fitting this distribution (integrated along $\mathbf{e}_z$) to the 
TOF BEC density distribution. The spin-dependent nature of the vector 
potential is evident in the opposite sign of the shear for each spin 
(Fig. 2b–f) and in that the magnitude of the shear coefficient follows 
the local derivative of the vector potential at the centre of the BEC. 
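
The local derivative referred to here can be sketched numerically. The snippet below assumes the coupling follows a Gaussian intensity profile with the 170-μm waist quoted in the text, and fixes the peak value so that ħΩ = 2.3E_R at y = −115 μm (the value given in the Fig. 4 caption). The function names and the finite-difference step are illustrative choices, not the authors' analysis code.

```python
import numpy as np

WAIST_UM = 170.0                    # 1/e^2 radius from the text
Y_REF_UM, OMEGA_REF = -115.0, 2.3   # hbar*Omega = 2.3 E_R at y = -115 um

# Peak coupling (units of E_R) implied by the assumed Gaussian profile.
OMEGA_PEAK = OMEGA_REF / np.exp(-2 * Y_REF_UM**2 / WAIST_UM**2)


def omega(y_um):
    """Raman coupling hbar*Omega in units of E_R at position y (um)."""
    return OMEGA_PEAK * np.exp(-2 * y_um**2 / WAIST_UM**2)


def vector_potential(y_um):
    """A(y) in units of hbar*k_R; zero where hbar*Omega >= 4 E_R."""
    ratio = omega(y_um) / 4.0
    return np.sqrt(np.clip(1.0 - ratio**2, 0.0, None))


def dA_dy(y_um, h=1e-3):
    """Central-difference derivative of A(y), in hbar*k_R per um."""
    return (vector_potential(y_um + h) - vector_potential(y_um - h)) / (2 * h)


for y in (-135.0, -115.0, -95.0):
    print(f"y = {y:6.1f} um: A = {vector_potential(y):.3f}, dA/dy = {dA_dy(y):+.4f}")
```

Under these assumptions the derivative changes sign across the beam centre, so the same spin would shear in opposite directions on either side of the laser profile.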

We first observed the SHE using spin-polarized BECs. This would 
be atypical in condensed-matter systems, where both spins are usually 
present. After preparing a spin-polarized BEC at a position, $y_0$, between 
$y_{\min} = -135$ μm and $y_{\max} = -95$ μm (grey shaded region in Fig. 2a, a 
region over which the SDLF was both reasonably large and uniform), 
we abruptly displaced the centre of the harmonic trap to $y_f = y_{\max}$ or 
$y_f = y_{\min}$. This displacement can formally be understood as resulting 
from an applied potential with gradient $V'$. The atoms accelerated to a 
final y-momentum $\hbar K_y$ in ~7 ms (one-quarter of the $\mathbf{e}_y$-trap period). 
During this time, the SDLF accelerated the atoms perpendicular to their 
instantaneous momentum, resulting in a final x-momentum $\hbar K_x$. By 
waiting this quarter-period after trap displacement, we ensured that the 
atoms always arrived at $y_f$ (regardless of the choice of $y_0$). Subsequently, 
the trap was turned off abruptly (in less than 1 μs), the Raman lasers were 
turned off slowly by comparison with dressed-state bandgaps (~500 μs), 
and the atoms were imaged after TOF to determine their final momentum 
(Fig. 3b). With this turn-off procedure, the atoms experienced a 
force $-\partial\check{\mathbf{A}}/\partial t$ along $\mathbf{e}_x$ (independent of $y_0$) that shifted the final 
centre-of-mass position after TOF from the observed position of atoms released 
in the absence of $\check{\mathbf{A}}$ (Methods). We calibrated this zero-momentum 
TOF position by detecting atoms released from rest at $y_f$. 

Each spin-polarized BEC acquired a momentum along $\mathbf{e}_x$ that was 
directed oppositely for the two spins and related to its final momentum 
along $\mathbf{e}_y$, demonstrating an intrinsic SHE. We modelled the dynamics 
of each spin (Fig. 3b, solid curves, and Methods) by solving the 
Heisenberg equations of motion. Because our atoms remain in the 
lowest-energy band plotted in Fig. 1b, the Heisenberg equations of 
motion reduce to classical dynamics subject to the spin–orbit-coupled 
dispersion shown in Fig. 1b. The model predicts both $K_x$ and $K_y$ as 
functions of initial and final trap displacement. We leave $\Omega$ as a fitting 
parameter, the value of which is within 15% of our calibrated value. 
The results of this model are plotted along with the data in Fig. 3b. 

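Next we realized the SHE in a configuration analogous to solid 
systems by using mixtures of both spins. In the presence of both spins, 
we define average spin and particle current densities 
$\langle\mathbf{j}_s\rangle = \langle\mathbf{j}_{\uparrow'}\rangle - \langle\mathbf{j}_{\downarrow'}\rangle$ and 
$\langle\mathbf{j}_p\rangle = \langle\mathbf{j}_{\uparrow'}\rangle + \langle\mathbf{j}_{\downarrow'}\rangle$, where the average current density for spin $i$ 
(either $\uparrow'$ or $\downarrow'$) is 

$$\langle\mathbf{j}_i\rangle = \frac{1}{V}\int n_i(\mathbf{r})\,\mathbf{v}_i(\mathbf{r})\,\mathrm{d}^3\mathbf{r},$$

with density $n$, velocity $\mathbf{v}$, and in situ BEC volume $V$. An equal current of each spin moving 
in the same direction corresponds to a pure particle current, and an 
equal current of each spin moving in opposite directions gives a pure 
spin current. 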
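
These definitions translate directly into a discretized sum. The sketch below assumes densities and velocities sampled on a uniform grid of cells; the function names, array shapes and sample values are hypothetical and serve only to illustrate the two limiting cases described above.

```python
import numpy as np


def average_current(density, velocity, cell_volume, total_volume):
    """(1/V) * sum over cells of n(r) v(r) dV; velocity has shape (N, 3)."""
    return (density[:, None] * velocity).sum(axis=0) * cell_volume / total_volume


def spin_and_particle_currents(n_up, v_up, n_dn, v_dn, cell_volume, total_volume):
    """Return (<j_s>, <j_p>) as the difference and sum of the per-spin currents."""
    j_up = average_current(n_up, v_up, cell_volume, total_volume)
    j_dn = average_current(n_dn, v_dn, cell_volume, total_volume)
    return j_up - j_dn, j_up + j_dn


# Hypothetical uniform cloud: four cells of volume 0.25, unit density per spin.
n = np.ones(4)
along_y = np.tile([0.0, 1.0, 0.0], (4, 1))
along_x = np.tile([1.0, 0.0, 0.0], (4, 1))

# Both spins move together along y: pure particle current.
js_a, jp_a = spin_and_particle_currents(n, along_y, n, along_y, 0.25, 1.0)
# Spins move oppositely along x: pure spin current.
js_b, jp_b = spin_and_particle_currents(n, along_x, n, -along_x, 0.25, 1.0)
```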

This third class of experiments started with BECs in a mixture of 
both spins (Methods). We generated a pure particle current using the 
trap displacement technique described above. As before, the system 
evolved under the SDLF for ~7 ms, after which time the atoms were 
released from the trap and the Raman lasers adiabatically turned off 
(Methods). Each TOF image contained information about both 
dressed spin states, allowing us to determine the spin and particle 
currents simultaneously. We modelled the resulting spin current along 
$\mathbf{e}_x$ as a function of coupling strength, $\Omega$, and potential gradient, $V'$. The 
system’s spin response, $\langle j_{s,x}\rangle = \langle\mathbf{j}_s\rangle \cdot \mathbf{e}_x$, is shown in Fig. 4a. By varying 
one parameter at a time (Fig. 4a, black lines), we measured the spin 
current as a function of $V'$ (Fig. 4b) and as a function of $\Omega$ (Fig. 4c). In 
both cases, the experiment agrees with our model. 

Despite the existence of the SHE in all spin–orbit-coupled metals 
and semiconductors, the technology for studying the SHE was 
developed only recently. Soon afterwards, the SHE was exploited to 
develop spintronic devices. In this spirit, our experiment describes an 
externally actuated ‘atomtronic’ bipolar spin transistor (refs 9, 10), where $\Omega$ 
plays the part of the transistor’s gate voltage and the potential gradient, 
$V'$, is analogous to the drain-source voltage. The spin current turns 
on abruptly at $\hbar\Omega \approx 1E_R$ (Fig. 4c), with a final spin current set by 
the potential gradient. For a given Raman coupling (‘gate voltage’), 
however, the spin current turns on smoothly with positive or negative 
potential gradient (Fig. 4b). This similarity between our system and a 
field-effect transistor (FET) is further highlighted in Fig. 4b, where the 
three black curves modelling our system’s response at three different 
Raman coupling strengths are compared with the characteristic response 
of a FET’s drain-source current as a function of drain-source 
voltage at three different gate-source voltages. 

In atomic systems, other techniques can separate particles according 
to spin, such as the well-known Stern–Gerlach effect. Our technique 
complements these, because the spin-dependent force depends not on 
the atoms’ positions (as in the Stern–Gerlach effect), but on their 
velocities. For example, a device with a finite SHE interaction region 
will deflect an incoming atomic beam by an amount independent of 
the velocity with which the atoms enter the region; although an 
increase in initial velocity decreases the interaction time, the perpendicular 
force increases (for interaction times much less than $2\pi m/B_0$). 
For devices using the Stern–Gerlach effect, the deflection depends only on 
the interaction time, which changes with initial velocity. A spin transistor 
might operate using either our SDLF or a Stern–Gerlach-type force, but 
in each case its behaviour will be quite different. For example, using our 
transistor as the input and output beam splitter in Mach–Zehnder-type 
inertial sensors could yield coherent adiabatic momentum splitting that 
is independent of the atoms’ longitudinal velocity profile. 
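
The velocity argument can be made concrete with a two-line estimate. In the sketch below, a force proportional to speed (F = B₀v, the Lorentz-like case) acts for a transit time L/v, while a constant force stands in for a Stern–Gerlach gradient. The function names and numbers are illustrative, and the estimate assumes transit times much shorter than 2πm/B₀ so that the trajectory stays nearly straight.

```python
import math


def she_deflection(b0, length, speed):
    """Transverse momentum from a velocity-proportional force F = b0 * speed."""
    interaction_time = length / speed
    return b0 * speed * interaction_time


def stern_gerlach_deflection(force, length, speed):
    """Transverse momentum from a velocity-independent force."""
    return force * length / speed


# Hypothetical values in arbitrary units: a tenfold change in entry speed.
slow, fast = 0.5, 5.0
print(she_deflection(2.0, 3.0, slow), she_deflection(2.0, 3.0, fast))
print(stern_gerlach_deflection(2.0, 3.0, slow), stern_gerlach_deflection(2.0, 3.0, fast))
```

The first pair of values coincides because the speed cancels, whereas the second pair scales inversely with the entry speed.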

We have demonstrated an intrinsic SHE in a quantum gas using 
a precisely engineered spin- and space-dependent vector potential. 
Systems such as this, with the experimental parameters available at 
present, are candidates for a.c. gravity gradiometers, when applied to 
dilute clouds where interaction effects are negligible. In addition, 
time-reversal-invariant topological insulators manifest the QSHE. Using 
present technologies, our method for producing the SHE could produce 
the QSHE in an ultracold gas of fermionic ⁴⁰K (Methods and 
Supplementary Information). Despite the technical challenges, the simplicity 
of our setup (two atomic spin states and two oppositely directed 
lasers) makes our approach an appealing method for achieving the 
QSHE. In similar parameter regimes, it may be possible to realize exotic, 
interacting topological insulators using Bose gases. 


METHODS SUMMARY 
System preparation. A bias magnetic field of $B_0 = 2.1$ mT lifted the degeneracy of 
the $|f = 1, m_F = 0, \pm 1\rangle$ spin states in the electronic ground-state manifold of ⁸⁷Rb, 
leading to an energy-level splitting, $\Delta E = 2\pi\hbar \times 15$ MHz, between $|m_F = -1\rangle$ and 
$|m_F = 0\rangle$ that matched the $\hbar\delta\omega$ energy difference between the Raman laser beams’ 
photons, where $\delta\omega$ is the frequency difference between the Raman beams. Owing 
to the large bias field, the $|m_F = +1\rangle$ spin state was detuned from Raman 
resonance by $17.8E_R$ and was inactive in our experiments. 
In the limit of zero Raman coupling, each dressed spin continuously connects to 
a bare spin with quasimomentum $\hbar|q| = \hbar k_R$. To load a specific dressed spin, we 
started with a BEC in $|m_F = -1\rangle$, $|m_F = 0\rangle$ or a mixture of those states and turned 
on the Raman lasers in 150 ms. During experiments on spin-polarized BECs, we 
avoided any undesired population of the other dressed spin state by applying a 
detuning $\hbar\delta = \Delta E - \hbar\delta\omega = 0.15E_R$ during the ramp up of $\Omega$ and then by shifting 
to resonance ($\delta = 0$) with a 1-ms ramp of $B_0$. An acousto-optic modulator shifted 
the position of the dipole trap beam propagating along $\mathbf{e}_x$, allowing controlled 
translation of the atomic sample along $\mathbf{e}_y$. 
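Dressed states. The single-particle properties of our system are well described by 
the Hamiltonian (ref. 21) 

$$\hat{H} = \frac{\hbar^2\left(\hat{q}^2 + \hat{k}_y^2 + \hat{k}_z^2\right)}{2m}\,\check{1} + \frac{\hbar\Omega}{2}\,\check{\sigma}_1 + \frac{\hbar^2 k_R\hat{q}}{m}\,\check{\sigma}_3 + E_R\,\check{1}$$

for resonant Raman coupling, as we use here. The eigenenergies 

$$\epsilon_\pm(q) = \frac{\hbar^2\left(q^2 + k_y^2 + k_z^2\right)}{2m} + E_R \pm \sqrt{\left(\frac{\hbar\Omega}{2}\right)^2 + \left(\frac{\hbar^2 k_R q}{m}\right)^2}$$

define a pair of effective dispersion relations, the lower of which, $\epsilon_-(q)$, is plotted 
for $k_y = k_z = 0$ in Fig. 1b for a selection of coupling strengths. 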


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS 


System preparation. A $B_0 = 2.1$-mT bias magnetic field lifted the degeneracy of 
the $|f = 1, m_F = 0, \pm 1\rangle$ spin states in the electronic ground-state manifold of ⁸⁷Rb, 
leading to an energy-level splitting, $\Delta E = 2\pi\hbar \times 15$ MHz, between $|m_F = -1\rangle$ and 
$|m_F = 0\rangle$ that matched the $\hbar\delta\omega$ energy difference between the Raman laser beams’ 
photons, where $\delta\omega$ is the frequency difference between the Raman beams. Owing 
to the large bias field, the $|m_F = +1\rangle$ spin state was detuned from Raman 
resonance by $17.8E_R$ and was inactive in our experiments. 

In the limit of zero Raman coupling, each dressed spin continuously connects to 
a bare spin with quasimomentum $\hbar|q| = \hbar k_R$. To load a specific dressed spin, we 
started with a BEC in $|m_F = -1\rangle$, $|m_F = 0\rangle$ or a mixture of those states, and turned 
on the Raman lasers in 150 ms. During experiments on spin-polarized BECs, we 
avoided any undesired population of the other dressed spin by applying a detuning 
$\hbar\delta = \Delta E - \hbar\delta\omega = 0.15E_R$ during the ramp up of $\Omega$ and then by shifting to resonance 
($\delta = 0$) with a 1-ms ramp of $B_0$. An acousto-optic modulator shifted the position of 
the dipole trap beam propagating along $\mathbf{e}_x$, allowing controlled translation of the 
atomic sample along $\mathbf{e}_y$. 
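Dressed states. The single-particle properties of our system are well described by 
the Hamiltonian (ref. 21) 

$$\hat{H} = \frac{\hbar^2\left(\hat{q}^2 + \hat{k}_y^2 + \hat{k}_z^2\right)}{2m}\,\check{1} + \frac{\hbar\Omega}{2}\,\check{\sigma}_1 + \frac{\hbar^2 k_R\hat{q}}{m}\,\check{\sigma}_3 + E_R\,\check{1}$$

for resonant Raman coupling, as we use here. The eigenenergies 

$$\epsilon_\pm(q) = \frac{\hbar^2\left(q^2 + k_y^2 + k_z^2\right)}{2m} + E_R \pm \sqrt{\left(\frac{\hbar\Omega}{2}\right)^2 + \left(\frac{\hbar^2 k_R q}{m}\right)^2}$$

define a pair of effective dispersion relations, the lower of which, $\epsilon_-(q)$, is plotted 
for $k_y = k_z = 0$ in Fig. 1b for a selection of coupling strengths. 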
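
As a cross-check on the double-well picture, the lower band can be obtained by direct numerical diagonalization. The sketch below assumes a 2 × 2 Hamiltonian with a Raman term proportional to σ₁ and a spin–momentum term proportional to σ₃, written in units where quasimomentum is measured in k_R and energy in E_R, with k_y = k_z = 0. Constant energy offsets are dropped because they do not move the minima. The grid search is a deliberately simple illustration, not the authors' procedure.

```python
import numpy as np

s1 = np.array([[0.0, 1.0], [1.0, 0.0]])
s3 = np.array([[1.0, 0.0], [0.0, -1.0]])


def lower_band(q, omega):
    """Lowest eigenvalue (units of E_R) at quasimomentum q (units of k_R)."""
    # hbar^2 q^2 / 2m -> q^2 E_R and hbar^2 k_R q / m -> 2 q E_R in these units.
    h = q**2 * np.eye(2) + 0.5 * omega * s1 + 2.0 * q * s3
    return np.linalg.eigvalsh(h)[0]


def well_position(omega, n=4001):
    """Location of the band minimum for q >= 0, found on a uniform grid."""
    q = np.linspace(0.0, 1.5, n)
    energies = np.array([lower_band(qi, omega) for qi in q])
    return q[np.argmin(energies)]


for omega in (1.0, 2.3, 5.0):
    analytic = np.sqrt(max(0.0, 1.0 - (omega / 4.0) ** 2))
    print(f"hbar*Omega = {omega} E_R: numeric {well_position(omega):.3f}, analytic {analytic:.3f}")
```

The numerical minima agree with the expression $\hbar k_R[1 - (\hbar\Omega/4E_R)^2]^{1/2}$ quoted in the main text, and collapse to q = 0 once the coupling exceeds 4E_R.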

Quantum SHE. Our technique for producing the SHE can be extended to realize 
the QSHE in two-dimensional ultracold Fermi gases. A simple example system 
that exhibits the QSHE can be constructed by overlapping two integer quantum 
Hall systems with filling factors of $\nu = 1$ and opposite magnetic field, the second of 
which implies that they have opposite Chern numbers. Although this construct 
(two separate, spatially overlapped electron systems that experience opposite 
magnetic field) is artificial, the QSHE can arise from an equal mixture of spins 
experiencing strong opposite, spin-dependent ‘magnetic’ fields. 

To understand intuitively how this might work, we consider our effective 
pseudospin Hamiltonian in two dimensions for $\hbar\Omega < 4E_R$ (ignoring the optical 
confinement, the scalar light shift from the Raman lasers and the zero-energy shift 
from the Raman dressing): 


Here $\check{1}$ is the 2 × 2 identity matrix, $A = \hbar k_R[1 - (\hbar\Omega/4E_R)^2]^{1/2}$ is the magnitude 
of the Raman-laser-induced vector potential, $\check{\mathbf{A}} = A\check{\sigma}_3\mathbf{e}_x$ is the matrix-valued 
vector potential, $\hat{\mathbf{p}}$ is the canonical momentum and $m^*$ is the effective mass tensor. 
Here, pseudospin is a good quantum number and the system can be thought of as 
two independent systems that respond oppositely to temporal and spatial gradients 
in $\check{\mathbf{A}}$. By introducing a large, non-zero $\nabla \times \check{\mathbf{A}}$, each spin state taken separately 
could be driven to the integer quantum Hall regime, thereby creating a QSHE in a 
system composed of an equal mixture of both spins. 

Our specific proposal to extend our work and realize the QSHE uses ⁴⁰K 
confined in a quasi-two-dimensional geometry in the $\mathbf{e}_x$–$\mathbf{e}_y$ plane. Two Raman lasers 
counterpropagating along $\mathbf{e}_x$ couple together two magnetic sublevels in the 
$|f = 9/2\rangle$ ground-state manifold. Tailoring the Raman lasers (using a spatial light 
modulator, for instance) to have a position-dependent coupling 
$\hbar\Omega(y) = 4E_R\sqrt{L_y^2 - y^2}/L_y$ for $y \in (0, L_y]$ along $\mathbf{e}_y$ produces a linearly varying $A$. Each 
pseudospin experiences an oppositely directed, uniform, synthetic magnetic field 
with cyclotron frequency $\omega_c = \hbar k_R/mL_y$ for $y \in (0, L_y]$. 
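
The step from the tailored coupling profile to a uniform field can be written out explicitly. The short derivation below substitutes the profile into the expression for A given earlier in the Methods; it uses the bare mass m, as the cyclotron frequency quoted above does, and is offered as a consistency check rather than as part of the original text.

```latex
\begin{align}
\left(\frac{\hbar\Omega(y)}{4E_R}\right)^2 &= \frac{L_y^2 - y^2}{L_y^2} = 1 - \frac{y^2}{L_y^2}, \\
A(y) &= \hbar k_R\left[1 - \left(\frac{\hbar\Omega(y)}{4E_R}\right)^2\right]^{1/2}
      = \hbar k_R\,\frac{y}{L_y}, \qquad y \in (0, L_y], \\
\left|\nabla \times \check{\mathbf{A}}\right| &= \left|\frac{\partial A}{\partial y}\right|
      = \frac{\hbar k_R}{L_y}, \\
\omega_c &= \frac{1}{m}\left|\frac{\partial A}{\partial y}\right| = \frac{\hbar k_R}{m L_y}.
\end{align}
```

The synthetic field is therefore uniform across the strip, with opposite sign for the two pseudospins because A multiplies $\check{\sigma}_3$.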

To reach the QSHE regime, the thermal energy scale, $k_BT$, the Fermi energy, $E_F$, 
and the cyclotron energy, $\hbar\omega_c$, must satisfy $k_BT < E_F \approx \hbar\omega_c$ (so that the Fermi 
energy falls in the gap between the ground and first Landau levels). Here, $k_B$ is 
Boltzmann’s constant and $T$ is the temperature. The cyclotron frequency therefore 
sets the energy scales necessary to produce a QSHE. For realistic system sizes of 
5–10 μm, the cyclotron frequency is $\omega_c/2\pi \approx 100$ Hz. In Supplementary 
Information, we make this argument rigorous for our actual experimental configuration. 
Notes on Fig. 2. The Raman coupling strength in Fig. 2a was measured as described 
in refs 23, 35. For the data in Fig. 2b–f, the aspect ratio of the BEC was adjusted from 
its typical cylindrical symmetry to be 50% longer along $\mathbf{e}_y$ than along $\mathbf{e}_x$ by adjusting 
the optical trap, and the atom number was maintained at >10°. 

Measurement and analysis. To measure the atoms’ momenta, the optical con- 
finement was turned off abruptly while the Raman lasers’ intensity was linearly 
ramped to zero in 0.5-1 ms. This procedure transferred each dressed spin to a 
bare spin moving with an x-momentum equal to its quasimomentum $q$ and a 
y-momentum equal to its in-trap y-momentum, $K_y$. A magnetic field gradient 
applied for a few milliseconds during the 30-ms TOF separated the two bare spins 
along e, through the Stern—Gerlach effect. After this separation, we measured the 
atomic density distribution and obtained its mean position. To determine the 
atoms’ in situ momenta, we referenced the measured TOF positions to the TOF 
positions observed for atoms under the same experimental conditions, but at rest. 
For example, when the trap was suddenly displaced as in Fig. 3 or 4, the reference 
position was determined by adiabatically dressing the atoms at the final trap 
position and measuring the TOF position. Subtracting the TOF position of the 
abruptly displaced atoms from the reference TOF position allowed us to determine 
the in-trap momentum. 
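
The referencing step amounts to a ballistic conversion from displacement to momentum. The sketch below assumes free expansion during the 30-ms TOF and neglects the initial in-trap position and interaction effects, which the following paragraph discusses as systematic corrections. The function name and the sign convention (displaced position minus reference position) are choices made here for illustration.

```python
import math

AMU = 1.66053906660e-27      # atomic mass unit, kg
M_RB87 = 86.909 * AMU        # approximate mass of a 87Rb atom, kg
T_TOF = 30e-3                # time of flight from the text, s
HBAR_K_R = 6.62607015e-34 / 790.13e-9   # recoil momentum h / lambda, kg m/s


def in_trap_momentum(tof_position_m, reference_position_m, mass=M_RB87, t_tof=T_TOF):
    """Momentum (kg m/s) inferred from the TOF displacement relative to atoms at rest."""
    return mass * (tof_position_m - reference_position_m) / t_tof


# A displacement of about 174 um after 30 ms corresponds to roughly one recoil momentum.
p = in_trap_momentum(174.3e-6, 0.0)
print(f"p = {p / HBAR_K_R:.3f} hbar k_R")
```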

This measurement of the momenta contained two contributions that biased the 
TOF positions away from the actual momenta. If the atoms do not reach their 
equilibrium position in the trap before TOF begins, our subtraction procedure 
does not yield the actual velocity, because this initial displacement is interpreted as 
momentum after TOF. According to our simulations, this resulted in a systematic 
underestimation of the momentum along $\mathbf{e}_x$ and $\mathbf{e}_y$. In addition, to compensate 
gravity during displacement of the optical trap, the overall intensity of the optical 
trapping beams was increased by 25% at the same time as the position of the optical 
trap was changed. Owing to the competition between the optical trap and the near- 
linear spatial dependence of the energy minimum of the Raman-dressed bands, 
this power increase shifted the equilibrium position of the atoms along $\mathbf{e}_y$ even in 
the absence of an optical trap displacement. We measured the equilibrium position 
of our atoms by increasing the power of the optical trap for ~7 ms without 
displacing it, leading to a small difference in our measured zero momentum from 
the actual zero momentum. These effects, which result in a momentum correction 
of up to 20%, were all included in our simulations. 

Small fluctuations in our laboratory magnetic bias field lift the energy degene- 
racy of the two pseudospin states, leading to fluctuations in the pseudospin popu- 
lation distribution. When working with a mixture of pseudospins, we discarded 
any measurement for which the population of one spin state was greater than 
150% of the other, resulting in up to 60% of the data from each sequence being 
omitted from analysis. In addition, when both dressed spins were used together, 
there was an initial spatial segregation of the spins owing to a repulsive interaction 
between them. Although the in situ spatial distribution of the spins was 
modified before the experiment began, this interaction energy did not significantly 
affect our momentum measurements, because the in situ displacement was small 
compared with the typical TOF displacements giving the momentum signal. 
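
The population cut described above is simple to state in code. In this sketch each measurement is represented by a hypothetical pair of atom numbers for the two spin states, and a shot is kept only if neither population exceeds 150% of the other. Treating an empty spin state as unbalanced is an assumption added here.

```python
def is_balanced(n_up, n_down, max_ratio=1.5):
    """True if neither spin population exceeds max_ratio times the other."""
    if n_up <= 0 or n_down <= 0:
        return False  # assumption: an empty spin state counts as unbalanced
    return max(n_up, n_down) <= max_ratio * min(n_up, n_down)


def keep_balanced(shots):
    """Filter a list of (n_up, n_down) pairs, discarding imbalanced mixtures."""
    return [shot for shot in shots if is_balanced(shot[0], shot[1])]


# Hypothetical sequence of four shots.
shots = [(100, 120), (100, 200), (0, 50), (90, 60)]
kept = keep_balanced(shots)
```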
Simulations. Because transitions between the dressed-spin bands are energetically 
suppressed owing to the large energy gap between bands (compared with the 
energy of the dynamics), the Heisenberg equations of motion for our system were 
the same as Hamilton’s classical equations of motion in the lowest band. In our 
simulation, the classical Hamiltonian included the modified position-dependent 
dispersion relation along $\mathbf{e}_x$ (Fig. 1b), the scalar potential from the Raman beams, 
the scalar potential from the optical dipole trap and the gravitational potential. 
The dispersion relation was calculated by diagonalizing our system’s 
spin–orbit-coupled Hamiltonian (ref. 21) and retaining only the lowest energy band. It is the 
position-dependent modified dispersion relation that drives the observed SHE. 
The solutions to Hamilton’s coupled differential equations yielded values for the 
position and momentum (or quasimomentum) in all three spatial directions as 
functions of time. For a given dressed spin, the simulated mechanical momentum 
$K_x$ was the difference between $q(t)$ and the location of the minimum of the 
dispersion curve associated with that dressed spin. Our model does not predict 
values of K,> 1.2hkg that were observed in the experiment, but this can be 
explained by deviations of our optical trap from the ideal Gaussian beams used 
in our model. 
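
A stripped-down version of such a simulation is sketched below. It keeps only the position-dependent lowest band and a harmonic trap along e_y, and omits the scalar light shift, gravity and the third dimension that the full model includes. Units with ħ = m = k_R = 1 are assumed (so energies are in units of 2E_R), and `OMEGA_PEAK` is a hypothetical value chosen so that ħΩ ≈ 2.3E_R at y = −115 μm for a 170-μm waist. It illustrates the mechanism only and does not reproduce the fitted curves.

```python
import math

# Units: hbar = m = k_R = 1 (energy unit 2 E_R, length unit 1/k_R).
K_R_PER_UM = 2 * math.pi / 0.79013   # k_R in 1/um for the 790.13-nm lasers
WAIST = 170.0 * K_R_PER_UM           # 1/e^2 radius of the Raman beams
OMEGA_PEAK = 2.872                   # hypothetical peak coupling, units of 2 E_R
TRAP_FREQ = 0.00476                  # about 2 pi x 35 Hz in units of 2 E_R / hbar


def coupling(y):
    return OMEGA_PEAK * math.exp(-2.0 * y * y / WAIST**2)


def well(y):
    """Location A(y) of the dispersion minimum (zero once the wells merge)."""
    return math.sqrt(max(0.0, 1.0 - (coupling(y) / 2.0) ** 2))


def energy(y, p_y, q, y_trap):
    band = 0.5 * q * q - math.sqrt((coupling(y) / 2.0) ** 2 + q * q)
    return band + 0.5 * p_y * p_y + 0.5 * TRAP_FREQ**2 * (y - y_trap) ** 2


def force_y(y, q, y_trap):
    """-dH/dy: band force from the y-dependent coupling plus the trap force."""
    om = coupling(y)
    d_om = om * (-4.0 * y / WAIST**2)
    band_force = om * d_om / (4.0 * math.sqrt((om / 2.0) ** 2 + q * q))
    return band_force - TRAP_FREQ**2 * (y - y_trap)


def simulate(y0, y_trap, spin=+1, dt=0.5):
    """RK4 integration of (y, p_y) for a quarter trap period; q is conserved."""
    q = spin * well(y0)              # atoms start at rest in their well
    y, p = y0, 0.0
    steps = int(round(0.5 * math.pi / TRAP_FREQ / dt))
    for _ in range(steps):
        k1y, k1p = p, force_y(y, q, y_trap)
        k2y, k2p = p + 0.5 * dt * k1p, force_y(y + 0.5 * dt * k1y, q, y_trap)
        k3y, k3p = p + 0.5 * dt * k2p, force_y(y + 0.5 * dt * k2y, q, y_trap)
        k4y, k4p = p + dt * k3p, force_y(y + dt * k3y, q, y_trap)
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    k_x = q - spin * well(y)         # mechanical momentum along e_x, units of hbar k_R
    return y, p, k_x


y_start, y_final = -115.0 * K_R_PER_UM, -95.0 * K_R_PER_UM
y_up, p_up, kx_up = simulate(y_start, y_final, spin=+1)
y_dn, p_dn, kx_dn = simulate(y_start, y_final, spin=-1)
print(f"K_y = {p_up:.2f}, K_x(up) = {kx_up:+.3f}, K_x(down) = {kx_dn:+.3f}")
```

Because the Hamiltonian does not depend on x, the quasimomentum q is conserved, and the mechanical momentum along e_x arises entirely from the change in the well position as the atoms move along e_y. The two spins acquire equal and opposite values.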

Linear Dresselhaus and Rashba SOC as a vector potential. Consider the Rashba 
and linear Dresselhaus SOC Hamiltonians in two dimensions”*: 


and 
Here $\hat{p}_i$ is the momentum along the $i \in \{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$ spatial direction and $\alpha$ and $\beta$ are 
the respective strengths of the Rashba and Dresselhaus SOCs. The total 
Hamiltonian containing both of these terms 


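$$\hat{H}_{\mathrm{SOC}} = \frac{\hat{\mathbf{p}}^2}{2m} + \hat{H}_{\mathrm{R}} + \hat{H}_{\mathrm{D}}$$


can be expressed as 

$$\hat{H}_{\mathrm{SOC}} = \frac{1}{2m}\left(\check{1}\hat{\mathbf{p}} - \check{\mathbf{A}}\right)^2 - \frac{m}{\hbar^2}\left(\alpha^2 + \beta^2\right)$$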


with 
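The generalized magnetic field from this vector potential is 


$$\check{\mathbf{B}} = \nabla \times \check{\mathbf{A}} - \frac{i}{\hbar}\,\check{\mathbf{A}} \times \check{\mathbf{A}} \propto \left(\alpha^2 - \beta^2\right)\check{\sigma}_3\mathbf{e}_z \qquad (3)$$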


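Lorentz force. A generalized magnetic field defined by equation (3) gives a generalized 
Lorentz force law. Following ref. 37, we start with a Hamiltonian 

$$\hat{H} = \frac{1}{2m}\left(\check{1}\hat{\mathbf{p}} - \check{\mathbf{A}}\right)^2$$

containing a non-Abelian vector potential in three spatial dimensions with a finite 
number of internal degrees of freedom. The Heisenberg equation of motion for the 
position $\hat{\mathbf{x}}$ is 

$$\frac{\mathrm{d}\hat{x}_i}{\mathrm{d}t} = \frac{1}{i\hbar}\left[\hat{x}_i, \hat{H}\right] = \frac{1}{m}\left(\check{1}\hat{p}_i - \check{A}_i\right) = \frac{\hat{\Pi}_i}{m}$$

We identify $\hat{\boldsymbol{\Pi}}$ as the particle’s mechanical momentum. The commutator 
$[\hat{\Pi}_i, \hat{\Pi}_j] = i\hbar\,\epsilon_{ijk}\check{B}_k$, or $\check{\mathbf{B}} = \nabla \times \check{\mathbf{A}} - (i/\hbar)\check{\mathbf{A}} \times \check{\mathbf{A}}$, defines the generalized 
magnetic field ($\epsilon_{ijk}$ is the Levi-Civita symbol). For Abelian vector potentials, the 
different components of $\check{\mathbf{A}}$ all commute, and this expression for $\check{\mathbf{B}}$ reduces to the 
familiar $\mathbf{B} = \nabla \times \mathbf{A}$. 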
We derive the Lorentz force law starting with the Heisenberg equation of 
motion for the mechanical momentum: 
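$$\frac{d\Pi_i}{dt} = \frac{1}{i\hbar}\left[\Pi_i, H\right] = \frac{1}{2m}\,\epsilon_{ijk}\left(\Pi_j B_k + B_k \Pi_j\right)$$
which is the ith component of the symmetrized Lorentz force law
$$\mathbf{F} = \frac{1}{2m}\left(\mathbf{\Pi} \times \mathbf{B} - \mathbf{B} \times \mathbf{\Pi}\right)$$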
Because the B field from linear combinations of Rashba and Dresselhaus SOCs
(equation (3)) and the B from our experiment are both proportional to $\sigma_3$, the
equations of motion for the mechanical momentum in the two cases are the same.
However, for the vector potential in equation (2), B does not commute with the
Hamiltonian, leading to an additional Heisenberg equation of motion for B which
must be included. Despite this additional complexity, the SDLF generates the SHE
in both situations.

Gauge invariance. The magnetic field defined by equation (3) is not gauge invari- 
ant. The definition of gauge transformations is generalized in any discussion of 
non-Abelian vector potentials. For the SU(2) symmetry group, a gauge transform 
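is a position-dependent unitary rotation in spin space
$$\psi \rightarrow V(\mathbf{x})\,\psi$$
with
$$V(\mathbf{x}) = \exp\left[i\boldsymbol{\alpha}(\mathbf{x}) \cdot \boldsymbol{\sigma}\right]$$
where $\boldsymbol{\alpha}$ is an arbitrary vector of functions of x and $\boldsymbol{\sigma}$ is the vector of 2 × 2 Pauli
matrices including the identity. Under this gauge transformation, the Lagrangian
must remain unchanged, requiring the magnetic field to transform according to
$$\mathbf{B} \rightarrow V(\mathbf{x})\,\mathbf{B}\,V^{\dagger}(\mathbf{x})$$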
Despite the lack of gauge invariance of the magnetic field, an Abelian magnetic 
field cannot be gauge transformed to a non-Abelian field. 

This definition for gauge transforms can be generalized to a gauge with generators
from any continuous symmetry group by replacing $\boldsymbol{\sigma}$ with a vector of the
generators of the symmetry group. For instance, in the case of a scalar vector
potential from classical electrodynamics with U(1) symmetry, the generator of
the symmetry group is a scalar, and the gauge transformation becomes the familiar
position-dependent phase.
A temporal cloak at telecommunication data rate


Joseph M. Lukens, Daniel E. Leaird & Andrew M. Weiner


Through advances in metamaterials—artificially engineered media 
with exotic properties, including negative refractive index’ *—the 
once fanciful invisibility cloak has now assumed a prominent place 
in scientific research* ’. By extending these concepts to the temporal 
domain”, investigators have recently described a cloak which hides 
events in time by creating a temporal gap in a probe beam that is 
subsequently closed up; any interaction which takes place during 
this hole in time is not detected’. However, these results are limited 
to isolated events that fill a tiny portion of the temporal period, 
giving a fractional cloaking window of only about $10^{-4}$ per cent
at a repetition rate of 41 kilohertz (ref. 15)—which is much too low 
for applications such as optical communications. Here we demon- 
strate another technique for temporal cloaking, which operates at 
telecommunication data rates and, by exploiting temporal self- 
imaging through the Talbot effect, hides optical data from a receiver. 
We succeed in cloaking 46 per cent of the entire time axis and conceal
pseudorandom digital data at a rate of 12.7 gigabits per second. This 
potential to cloak real-world messages introduces temporal cloaking 
into the sphere of practical application, with immediate ramifica- 
tions in secure communications. 

As in the first demonstration of a ‘time cloak’!®, the theoretical 
foundation for our cloak is space-time duality, the formal mathematical 
equivalence between paraxial diffraction and narrowband dispersion’*”. 
This correspondence permits the extension of concepts typically assoc- 
iated with spatial Fourier imaging into the time domain. For example, 
just as a traditional thin lens applies a quadratic phase in space, a 
temporal lens can be constructed that applies a quadratic phase profile 
in time. But although time lenses with extremely large chirp coeffi- 
cients can be obtained through parametric nonlinear interactions'*””, 
such schemes are not easily implemented at gigahertz rates. Instead, 
electro-optic phase modulators prove a more suitable choice. Phase 
modulators, which are standard components in optical communica- 
tions, offer wavelength transparency, high radio-frequency bandwidth, 
and simplicity of operation, because they require only a single radio- 
frequency input and are optically linear”®. 

Yet because phase modulators are typically driven with sinusoidal 
voltages, which are only approximately quadratic over a small temporal 
window, they suffer from severe temporal aberrations”’. Such distortions 
are particularly harmful in implementing a temporal cloak, because a 
continuous-wave input necessarily extends well beyond the parabolic 
peaks of the sinusoid. Moreover, the original temporal cloak uses split time- 
lenses, which apply a discontinuous frequency chirp to the continuous- 
wave probe; after propagating through dispersive fibre, the spectral 
content of the waveform separates in time, leaving a gap with zero 
intensity’®. This discontinuous chirp requires that the parabolic approxi- 
mation remain valid all the way to the edges of the time lens—precisely 
where it breaks down completely for a sinusoid. Further, even if we 
generate a non-sinusoidal radio-frequency signal that more accurately 
approximates a parabola, replicating the chirp discontinuity still requires 
extremely high bandwidth at repetition rates suitable for telecommuni- 
cations. (See the Supplementary Information for further discussion.) 
Under these restrictions, an alternative to the split time-lens is required 
for temporal cloaking in the gigahertz regime. 


Interestingly, the desired transformation of continuous-wave light 
into clean, high-extinction pulses is closely related to the generation of 
optical frequency combs through electro-optic modulation. In this appli- 
cation, the goal is to convert a continuous-wave input into a broadband 
frequency comb with a smooth spectrum. One such method for flat 
comb generation exploits a temporal version of the Talbot effect. Observed 
in spatial optics as early as 1836, the Talbot phenomenon yields perfect 
regeneration of the optical field at discrete distances away from a 
periodic grating’. Through space-time duality, a temporal analogue 
arises. Specifically, for an electric-field envelope periodic at the radio
frequency $\omega_{rep}$, where ‘rep’ indicates repetition frequency, modulated
by an optical carrier at frequency $\omega_0$ and traversing a medium described
by propagation constant $\beta(\omega) = \beta_0 + \beta_1(\omega - \omega_0) + \frac{1}{2}\beta_2(\omega - \omega_0)^2$,
the waveform exactly reproduces itself at multiples of the Talbot distance
$L_T = 4\pi/(|\beta_2|\,\omega_{rep}^2)$.
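As a rough consistency check on the Talbot relation, the accumulated dispersion needed for a fraction of the Talbot distance can be converted to the ps nm$^{-1}$ units used for fibre components. The sketch below is an illustration, not the authors' design procedure: the function name is hypothetical, and it assumes the standard conversion $|D| = 2\pi c\,|\beta_2|/\lambda^2$ together with the 12.71 GHz drive frequency and 1,541.9 nm wavelength quoted in the Methods Summary.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def talbot_dispersion_ps_per_nm(rep_rate_hz, wavelength_m, fraction=0.25):
    """Magnitude of accumulated dispersion |D| L for a fraction of the Talbot distance.

    Uses |beta_2| L_T = 4 pi / omega_rep^2 and |D| = 2 pi c |beta_2| / lambda^2.
    """
    omega_rep = 2.0 * math.pi * rep_rate_hz
    beta2_length = fraction * 4.0 * math.pi / omega_rep**2           # s^2
    d_length = 2.0 * math.pi * SPEED_OF_LIGHT / wavelength_m**2 * beta2_length  # s/m
    return d_length * 1e3  # 1 s/m = 1e12 ps / 1e9 nm = 1e3 ps/nm


if __name__ == "__main__":
    value = talbot_dispersion_ps_per_nm(12.71e9, 1541.9e-9)
    print(f"quarter-Talbot dispersion: {value:.1f} ps/nm")
```

Under these assumptions the quarter-Talbot dispersion comes out close to 390 ps nm$^{-1}$, which is in line with the approximate value quoted for the chirped fibre Bragg gratings.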

As a special case, when a continuous-wave input is sinusoidally 
phase-modulated at an amplitude of $\pi/4$, then propagated through
the fractional Talbot distance $L_T/4$, the output waveform consists of
high-extinction, 50% duty-cycle pulses. These pulses can be imaged
effectively by a second application of sinusoidal phase, because at this 
point the optical energy lies primarily within a window where the 
quadratic phase of an ideal time lens is well approximated. Sub- 
sequent dispersion chosen to satisfy the temporal imaging condition 
compresses these pulses even further, thereby creating large temporal 
gaps, the hallmark of a cloak. And this compression is achieved with 
phase-only elements—which are reversible apart from linear insertion 
loss—so it can be undone with inverse dispersion and modulation, thus 
completing the temporal cloak. The spatial equivalent of this cloaking 
circuit is highlighted in Fig. 1, revealing the large cloaking window 
possible with the Talbot effect. We emphasize that no discontinuity 
in the chirp rate is required. 
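The first stage described above (sinusoidal phase modulation followed by quarter-Talbot dispersion) can be reproduced in a few lines. This is a minimal numerical sketch, not the authors' simulation: it works in units of the repetition period, treats the dispersion as an ideal quadratic spectral phase, and uses a hypothetical function name. At one quarter of the Talbot distance the nth harmonic of the repetition frequency acquires a phase of $\pi n^2/2$, so even harmonics are unchanged and odd harmonics are multiplied by i.

```python
import numpy as np


def quarter_talbot_intensity(mod_index=np.pi / 4, n_samples=1024):
    """Intensity over one period after phase modulation and L_T/4 of dispersion."""
    t = np.arange(n_samples) / n_samples              # time in units of the period
    field_in = np.exp(1j * mod_index * np.cos(2 * np.pi * t))
    # Integer harmonic number of each FFT bin.
    harmonics = np.rint(np.fft.fftfreq(n_samples, d=1.0 / n_samples)).astype(int)
    # Quadratic spectral phase pi * n^2 / 2, reduced modulo 2 pi for accuracy.
    dispersion_phase = np.exp(1j * (np.pi / 2) * (harmonics**2 % 4))
    field_out = np.fft.ifft(np.fft.fft(field_in) * dispersion_phase)
    return t, np.abs(field_out) ** 2


if __name__ == "__main__":
    _, intensity = quarter_talbot_intensity()
    duty_cycle = np.mean(intensity > 0.5 * intensity.max())
    print(f"min {intensity.min():.2e}, max {intensity.max():.3f}, duty {duty_cycle:.2f}")
```

In this idealized model the output intensity is $1 - \sin[(\pi/2)\cos\omega_{rep}t]$, which falls to zero once per period and exceeds half its peak value for half of the period, consistent with the high-extinction, 50% duty-cycle pulses described in the text.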

Figure 2a presents the full experimental arrangement. The first 
phase modulator applies a small phase modulation to the input, and 
a chirped fibre Bragg grating provides the required fractional Talbot 
dispersion. The second phase modulator widens the signal bandwidth, 
and optical fibre compresses the waveform in time. The spectro-temporal 
characteristics of the optical probe at the event plane are summarized 
in Fig. 2b and c; the broadband frequency comb is compressed smoothly 
to an autocorrelation full-width at half-maximum (FWHM) of 11.7 ps, 
corresponding to about 15% of the 78.7-ps repetition period. The 
following fibre, phase modulators, and chirped fibre Bragg grating 
simply undo the effects of their counterparts, leaving a continuous- 
wave output. The extra dispersive link after the final phase modulator 
ensures that the cloak can itself be hidden; that is, when the phase 
modulators are switched off, the applied event appears at the output 
unaltered, as if the cloak were absent completely. This requires that a 
net dispersion of around 0 ps nm$^{-1}$ be experienced by the uncloaked
event, which necessitates the additional dispersive link. On the other 
hand, when the cloak is operational, the output of the last phase modu- 
lator is essentially continuous-wave, so the extra dispersion has no 
impact. As another modification, we employ a 12.3-Gb s$^{-1}$ photoreceiver
for detection of the temporal output, presenting bandwidth
filtering as a cloak enhancer. For relatively narrowband events, a spec- 
tral filter with properly chosen bandwidth can be used to remove 
Figure 1 | Spatial analogue of temporal cloaking circuit. a, Temporal ray
diagram highlighting the spatial equivalent of the experimental set-up. 

@, = f,Ly represents the Talbot dispersion for dispersion-compensating fibre 
with dispersion constant /. Owing to the diffractive nature of the Talbot effect, 
temporal ray optics is not strictly applicable, but we nonetheless include this ray 


residual high-frequency sidebands—resulting from cloak imperfections— 
while still passing the event itself intact. This principle, discussed in 
detail in the Supplementary Information, proves extremely useful in 
cloak operation. Under these conditions, the reconstructed waveform 
is as summarized in Fig. 2d and e. The cloak is able to reproduce the 
continuous-wave input: the final spectrum consists of one line, match- 
ing the input spectrum, and the temporal waveform is nearly flat, albeit 
with some parasitic modulation due to cloak imperfections. 
Applying a sinusoid to the electro-optic intensity modulator in the 
event plane, we obtain the results of Fig. 3. We find that the cloak 
completely hides the presence of the perturbation, removing the spectral 
sidebands and turning the high-contrast temporal modulation into a 
nearly flat line. This periodic event enables measurement of a defining 
metric of cloak performance: the cloaking window. To quantify this 
aspect, we look at the photodetected signal’s relative root-mean-square 
fluctuation as the perturbation is shifted in time from the optimum 
cloaking point. For definiteness, the cloaking window is taken as the 
temporal offset at which this fractional modulation has increased to 
one-half the value in the uncloaked case; Fig. 3d furnishes the results of 
this measurement. A cloaking window of 46% is found, which represents 
a conservative estimate: the sinusoidal modulation is of significant 
duration itself, so the actual cloaked region is wider than that indicated 
simply by the temporal offset. Unlike the arrangement in ref. 15, which 
can be viewed as a cloak of temporally isolated events, the periodicity in 
ours cannot be ignored; in fact, it is precisely this periodicity which 
permits use of the Talbot effect. In this sense, it is profitable also to 
compare our temporal cloak to metamaterial cloaking arrays**. In a 
recent experiment” using tapered gold-coated waveguides, about 20% 
of the total two-dimensional surface area was cloaked—a number similar 
to what is obtained here. Thus our cloak meets at the intersection of two 
recent metamaterial concepts: the temporal cloak and the cloaking array. 
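The quoted fractions follow directly from the repetition period. The short sketch below is only an arithmetic illustration with a hypothetical helper name; it takes the 12.71 GHz drive frequency from the Methods Summary, the 18 ps half-width from Fig. 3d and the 11.7 ps pulse width from Fig. 2c.

```python
def fraction_of_period(duration_ps, rep_rate_ghz):
    """Return a duration as a fraction of the repetition period."""
    period_ps = 1e3 / rep_rate_ghz  # period in ps for a rate given in GHz
    return duration_ps / period_ps


if __name__ == "__main__":
    window = fraction_of_period(2 * 18.0, 12.71)   # double-ended cloaking window
    pulse = fraction_of_period(11.7, 12.71)        # compressed probe pulse (FWHM)
    print(f"cloaking window: {window:.1%}, probe pulse: {pulse:.1%}")
```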
In addition to hiding a deterministic periodic signal, our cloak is 
also able to mask pseudorandom data. We use an inverted (dark) 
diagram for visualization. b, Corresponding simulated intensity distribution.
Wide cloaking windows of zero intensity appear at the temporal focus, nearing
the duration of the repetition period, $T_{rep}$. c, Temporal intensity slices at specific
locations in the circuit (panels from left to right): first grating, negative lens 
array, event plane, positive lens array and final grating. 


return-to-zero modulation format, which ensures that the optical 
transmission function returns to a maximum during each cycle, as 
required to provide temporal regions through which the compressed 
probe pulses can pass. Cloak performance for dark return-to-zero data 
is summarized in Fig. 4, for both pseudorandom and specific bit 
sequences. When the phase modulators are off, high-contrast voltage 
transitions are evident; when the cloak is turned on, these transitions 
reduce to a single flat line, and the data are effectively cloaked. This 
temporal cloak consequently succeeds in hiding communications at 
will, by simply turning four phase modulators on and off. 

Moreover, future cloaks based on our arrangement have the potential 
for significant improvements, both in terms of operational bandwidth 
and the duration of the cloaked region. Two distinct bandwidths 
deserve consideration: that of the input probe, and that of the event to 
be cloaked. Concerning the probe, the cloak is fundamentally narrow- 
band. The dispersion and phase modulation are selected precisely 
under the assumption of a continuous-wave optical input, and a broad- 
band optical input is not guaranteed to develop into sharp pulses at the 
event plane. On the other hand, the bandwidth of the event could in 
principle be made much wider. Because of the filtering effect of the 
detection scheme (see Supplementary Information), the current cloak 
is admittedly limited to event bandwidths approximately twice the 
modulation frequency. However, this filtering is required only because 
of imperfections in the cloak itself, particularly the difficulty in exactly 
matching two phase modulators. With improved uniformity in the 
phase modulators, such filtering could be removed, permitting distur- 
bances with much broader frequency content. In fact, the event could 
possess a bandwidth wider than that created by the phase modulation 
for probe compression, provided it lies within the passband of the 
optical components used. The disturbance must be temporally restricted 
to allow unity transmission over some fraction of the period, but no 
such restriction is imposed on its modulation bandwidth. Indeed, 
impulsive events with extremely large bandwidths are actually easier 
Figure 2 | Experimental set-up. a, Schematic of the complete cloaking circuit.
CW, continuous-wave input laser; PM, phase modulator; CFBG, chirped fibre 
Bragg grating; SMF, single-mode fibre; DCF, dispersion-compensating fibre; 
IM, intensity modulator; AMP, erbium-doped fibre amplifier. b, Comb 
spectrum at the event plane, consisting of 16 spectral lines in the 10-dB 
bandwidth. c, Corresponding intensity autocorrelation, shown over one full 
temporal period. The FWHM is 11.7 ps. d, Spectrum at the output of the 
cloaking circuit, when no event is applied. e, Corresponding temporal output 
measured on a photodetector, compared to the case when all phase modulators 
are off. 
Figure 3 | Cloaking of sinusoidal modulation. a, Output spectrum when the
phase modulators are off and a sinusoid is applied to the intensity modulator. 
b, Spectrum when the cloak is on, demonstrating removal of the sidebands in 
a. (Spectra are normalized as in Fig. 2d.) c, Corresponding temporal output. 
When the cloak is turned on, the previously high-contrast modulation is 
reduced to a flat line, hiding this event from an observer. d, Measurement of the 
temporal cloaking window. The fractional modulation reaches one-half that in 
the uncloaked case at a detuning of 18 ps, for a total double-ended cloaking 
window of 36 ps, or 46% of the temporal period. 
Figure 4 | Cloaking of data. a, Temporal output when length $2^{31} - 1$
pseudorandom data are applied to the intensity modulator, measured on a 
sampling oscilloscope. The clear transitions between high and low data levels 
present when the cloak is off are completely removed when the phase 
modulators are on. b, Output for a particular sequence of ones and zeros. 
Although the binary data specified on the bottom of the plot are clearly detected 
when the cloak is off, the voltage swings indicative of bit transmission are 
suppressed to a nearly flat line when the cloak is on. 


to cloak than the data examined here, because they fit easily within the 
cloaking window. Additionally, from an operational perspective, the 
data rate of 12.7 GHz could be easily increased to 40 GHz and higher 
with state-of-the-art LiNbO$_3$ modulators. Because the required values
of optical dispersion depend on this choice of frequency, one can simply 
choose different dispersive links to convert our cloak at 12.7 GHz to 
any other convenient repetition frequency. 

The 46% cloaking window is currently limited by phase modulator 
performance. The duration of the compressed probe beam (Fig. 2c) is 
inversely proportional to the optical bandwidth after the second phase 
modulator in our experiment (spectrum in Fig. 2b), which is in turn 
directly proportional to the phase modulator’s modulation index”’. 
The modulation index itself is limited by the phase modulator’s maxi- 
mum allowable radio-frequency input power, which prevents much 
shorter probe pulses and wider cloaking windows with the current 
arrangement. Yet by replacing the second and third phase modulators 
each with separate series of cascaded modulators, the net effective 
index can be increased even without improved technology. In fact, this 
principle has already been applied in pulse compression studies”, 
yielding pulses more than eight times shorter than what we obtain 
here. This implies that a fractional cloaking window of over 90% could 
be possible in our set-up, using three cascaded phase modulators 
instead of one. Closely approaching the limit of 100% would at present 
require too many phase modulators to be practical, but it nonetheless 
remains a possibility for the future; nothing inherently prevents it. 

From a more fundamental perspective, our experiments highlight 
the efficacy of the Talbot effect for general cloaking. In a sense, it 
produces its own cloak, even in a single dispersive medium: through 
self-imaging, a phase-modulated signal that is compressed to pulses at 
$L_T/4$ will naturally return to continuous-wave light upon further propagation.
Therefore, although we used additional time-lensing to substantially
expand the cloaking window, this is not intrinsically necessary.
One could envision removing these time lenses entirely and simply 
letting the Talbot effect run its course. A spatial equivalent could then 
be implemented with phase gratings in a simple uniform medium. This 
example stresses yet again the unique insights afforded by space-time 
duality for enriching our understanding of seemingly disparate phenomena. 


METHODS SUMMARY 

All phase modulators are driven by a single low-noise sine-wave generator 
(Agilent E8257D) operating at a frequency of 12.71 GHz. The gain of all radio- 
frequency amplifiers is regulated to ensure matched modulation, either via a 
control voltage or with tunable radio-frequency attenuators. The first and fourth 
phase modulators are driven at a modulation index of $\pi/4$, and the second and
third at an index of around $2\pi$, limited by the maximum allowable radio-frequency
input power. The continuous-wave laser (Koheras AdjustiK) is operated at about 
1,541.9 nm, and radio-frequency phase shifters are used to align the applied phase 
of each phase modulator. The optical fibre consists of around a kilometre of 
standard Corning SMF-28e and a dispersion-compensating fibre module from
Optical Fibre Solutions, with net dispersions of around 17 ps nm$^{-1}$ and −17 ps nm$^{-1}$,
respectively; the chirped fibre Bragg gratings were designed by Proximion
Fibre Systems, and feature approximate dispersions of ±400 ps nm$^{-1}$. Erbium-doped
fibre amplifiers are required to compensate for the insertion loss of the
optical components. 

For sinusoidal modulation, the intensity modulator is driven with the clock 
signal directly and is biased at the half-power point; for data, a bit-error-rate tester 
(Agilent N4901B) generates a non-return-to-zero sequence which is then con- 
verted to the desired return-to-zero drive signal by a digital logic circuit (Hittite 
HMC706LC3C) and applied to the intensity modulator, this time at zero bias. A 
fraction of the final output is split off and recorded on an optical spectrum analyser 
with a resolution of 0.01 nm. The remainder is detected with a 12.3-Gb s$^{-1}$ photoreceiver
(Agere 2560A-C02), and the electrical output is measured on a fast sampling
oscilloscope (Tektronix DSA8200). The voltage levels in the cloaked cases do
not align with the uncloaked peaks because of amplifier saturation, which forces 
the average output power to remain constant; a continuous-wave waveform must 
drop slightly compared to the modulated case to conserve the integrated power. 

Currently, thermo-optic effects in the optical fibre links limit long-term opera- 
tion. Small changes in the refractive index resulting from thermal drift cause the 
timing between complementary modulators to lose synchronization; under the 
conditions in our laboratory, this means that the phase shifters require slight 
readjustment approximately every 15 min. However, replacing all optical fibre 
with equivalent chirped fibre Bragg gratings would remove this instability by 
lowering the total length of silica from over a kilometre to just a few metres, 
thereby making timing drift negligible relative to the radio-frequency period. 
Heat dissipation in atomic-scale junctions


Woochul Lee, Kyeongtae Kim, Wonho Jeong, Linda Angela Zotti, Fabian Pauly, Juan Carlos Cuevas & Pramod Reddy


Atomic and single-molecule junctions represent the ultimate limit 
to the miniaturization of electrical circuits’. They are also ideal 
platforms for testing quantum transport theories that are required 
to describe charge and energy transfer in novel functional nano- 
metre-scale devices. Recent work has successfully probed electric 
and thermoelectric phenomena”* in atomic-scale junctions. 
However, heat dissipation and transport in atomic-scale devices 
remain poorly characterized owing to experimental challenges. 
Here we use custom-fabricated scanning probes with integrated 
nanoscale thermocouples to investigate heat dissipation in the elec- 
trodes of single-molecule (‘molecular’) junctions. We find that if 
the junctions have transmission characteristics that are strongly 
energy dependent, this heat dissipation is asymmetric—that is, 
unequal between the electrodes—and also dependent on both the 
bias polarity and the identity of the majority charge carriers (elec- 
trons versus holes). In contrast, junctions consisting of only a few 
gold atoms (‘atomic junctions’) whose transmission characteristics 
show weak energy dependence do not exhibit appreciable asym- 
metry. Our results unambiguously relate the electronic trans- 
mission characteristics of atomic-scale junctions to their heat 
dissipation properties, establishing a framework for understand- 
ing heat dissipation in a range of mesoscopic systems where trans- 
port is elastic—that is, without exchange of energy in the contact 
region. We anticipate that the techniques established here will 
enable the study of Peltier effects at the atomic scale, a field that 
has been barely explored experimentally despite interesting theore- 
tical predictions’"’. Furthermore, the experimental advances 
described here are also expected to enable the study of heat transport 
in atomic and molecular junctions—an important and challenging 
scientific and technological goal that has remained elusive’*’. 
Charge transport is always accompanied by heat dissipation (Joule 
heating). This process is well understood at the macroscale, where the 
power dissipation (heat dissipated per unit time) is volumetric and is given 
by $j^2\rho$, where j is the magnitude of the current density and $\rho$ is the electrical
Figure 1 | Nanoscale thermocouple probes and atomic and molecular
junctions studied in this work. a, Scanning electron microscope (SEM) image 
of a NTISTP. The electrodes are false-coloured. Inset, magnified image of the 
tip. b, Diagram of a junction created between the NTISTP (cross-sectional 


resistivity. Heating in atomic-scale junctions is expected to be fundament- 
ally different, as charge transport through such junctions is largely 
elastic'*’*. Recent experiments have probed the local non-equilibrium 
electronic and phononic temperatures in molecular junctions'*'* to 
obtain insights into the effect of electron-electron and electron-phonon 
interactions on heat dissipation at the atomic scale. However, experi- 
mental challenges in quantitatively measuring atomic-scale heat dissipa- 
tion have impeded the elucidation of a fundamental question: what is the 
relationship between the electronic transmission characteristics of atomic 
and molecular junctions (AMJs) and their heat dissipation properties? 
In this work, we overcome this challenging experimental hurdle by 
using custom-fabricated nanoscale-thermocouple integrated scan- 
ning tunnelling probes (NTISTPs; Fig. 1a and b). The NTISTPs feature 
an outer gold (Au) electrode that is electrically isolated but thermally 
well connected to the integrated gold-chromium thermocouple via a 
thin (70 nm) silicon nitride film (see Supplementary Information for 
fabrication details). To probe heat dissipation, we first created a series of 
AMJs (Fig. 1c) between the outer Au electrode of the NTISTP and a flat 
Au substrate. Application of a voltage bias across such AMJs results in a 
temperature rise of the integrated thermocouple due to heat dissipation 
in the NTISTP’s apex on a length scale comparable to the inelastic mean
free path of electrons in Au (ref. 19). The power dissipation in the probe
($Q_P$) and the temperature rise of the thermocouple ($\Delta T_{TC}$), located
~300 nm away from the apex, are directly related by $Q_P = \Delta T_{TC}/R_P$
(see Methods), where $R_P$ is the thermal resistance of the NTISTP (see
Fig. 1b). Further, $\Delta T_{TC}$ is related to the thermoelectric voltage output of
the thermocouple ($\Delta V_{TC}$) by $\Delta V_{TC} = -S_{TC}\Delta T_{TC}$, where $S_{TC}$ is the
effective Seebeck coefficient of the thermocouple. We note that $R_P$
and $S_{TC}$ were experimentally determined to be 72,800 ± 500 K W$^{-1}$
and 16.3 ± 0.2 µV K$^{-1}$, respectively (Supplementary Information).
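The two relations above chain together to convert a measured thermocouple voltage into dissipated power. The sketch below is only an illustration of that conversion with hypothetical function names; the calibration values are those quoted in the text, and the combined uncertainty assumes the two calibration errors are independent, which is an assumption made here rather than a statement from the paper.

```python
import math

R_P = 72_800.0       # thermal resistance of the probe, K/W (quoted value)
R_P_ERR = 500.0      # quoted uncertainty, K/W
S_TC = 16.3e-6       # effective Seebeck coefficient, V/K (quoted value)
S_TC_ERR = 0.2e-6    # quoted uncertainty, V/K


def temperature_rise(delta_v_tc):
    """Thermocouple temperature rise (K) from its voltage output (V)."""
    return -delta_v_tc / S_TC


def probe_power(delta_v_tc):
    """Power dissipated in the probe (W) from the thermocouple voltage (V)."""
    return temperature_rise(delta_v_tc) / R_P


def calibration_relative_uncertainty():
    """Relative uncertainty of the voltage-to-power factor (independent errors assumed)."""
    return math.hypot(R_P_ERR / R_P, S_TC_ERR / S_TC)


if __name__ == "__main__":
    dv = -1.0e-6  # example thermocouple output of -1 microvolt
    print(f"dT = {temperature_rise(dv) * 1e3:.1f} mK, Q_P = {probe_power(dv) * 1e6:.2f} uW")
    print(f"calibration uncertainty ~ {calibration_relative_uncertainty():.1%}")
```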
We began our experimental studies, at room temperature, by trap- 
ping single molecules of 1,4-benzenediisonitrile (BDNC; Fig. 1c) 
between the Au electrodes of the NTISTP and the substrate using a 
break junction technique*”’. We first obtained electrical conductance 
view) and a Au substrate (bottom) along with a thermal resistance network
(right) that represents the dominant resistances to heat flow. c, Diagrams of 
molecular and atomic junctions (top) along with the structures of the molecules 
studied (bottom). (All diagrams are not drawn to scale or proportion.) 
versus displacement traces by monitoring the electrical current under
an applied bias while the NTISTP-substrate separation was system- 
atically varied. Figure 2a shows representative conductance traces 
along with a histogram obtained from 500 such curves. The histogram 
features a peak at ~0.002$G_0$ ($G_0 = 2e^2/h \approx (12.9\ \mathrm{k\Omega})^{-1}$), which represents
the most probable low-bias conductance of Au-BDNC-Au junctions
($G_{Au-BDNC-Au}$) and is in good agreement with past work.
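For orientation, the conductance quantum and the corresponding junction resistance can be evaluated from the exact SI values of e and h. This is a small illustrative calculation; the variable names are chosen here.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact in the 2019 SI)
PLANCK_CONSTANT = 6.62607015e-34     # J s (exact in the 2019 SI)

# Conductance quantum G0 = 2 e^2 / h, in siemens.
G0 = 2.0 * ELEMENTARY_CHARGE**2 / PLANCK_CONSTANT

# Most probable low-bias conductance reported for the junction, ~0.002 G0.
junction_conductance = 0.002 * G0

if __name__ == "__main__":
    print(f"1/G0 = {1.0 / G0 / 1e3:.2f} kOhm")
    print(f"junction resistance ~ {1.0 / junction_conductance / 1e6:.2f} MOhm")
```

A conductance of 0.002$G_0$ therefore corresponds to a junction resistance of several megaohms.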

To probe heat dissipation, we created stable Au-BDNC-Au junc- 
tions with a conductance that is within 10% of the most probable 
low-bias conductance”’. We studied heat dissipation in 100 distinct 
Au-BDNC-Au junctions, at each bias, to obtain the time-averaged 
temperature rise ($\Delta T_{TC,avg}$) and the time-averaged power dissipation
in the NTISTP ($Q_{P,avg}$) for both positive and negative biases. Here, a
positive (negative) bias corresponds to a scenario where the probe is 
grounded, while the substrate is at a higher (lower) potential. We note 
that a modulated voltage bias was applied to the junctions to obtain 
$\Delta T_{TC,avg}$, with high resolution, for both positive and negative biases
(see Methods and Supplementary Information). This modulation 
scheme enables rejection of broadband noise and plays a critical role 
in performing high-resolution thermometry. 

The circles (triangles) in Fig. 2b represent the measured $\Delta T_{TC,avg}$
as well as the estimated $Q_{P,avg}$ for positive (negative) biases as a
function of the time-averaged total power dissipation in the junctions
($Q_{Total,avg}$) at each bias voltage. Here, $Q_{Total,avg}$ represents all the
power dissipated in the junction, at a given bias voltage, and can be
readily obtained from the measured current (I) and the known voltage
bias (V) applied to the junction (see Methods). We note that the
current-voltage (I-V) characteristics of Au-BDNC-Au junctions are
nonlinear (Fig. 2c); therefore, in general, $Q_{Total,avg} \neq G_{Au-BDNC-Au}V^2$.
The dotted line in Fig. 2b corresponds to the expected temperature rise
of the probe if the heating was symmetric, that is, if half of the total
power was dissipated in the probe ($\Delta T_{Symmetric} = (Q_{Total,avg}/2)R_P$). It
can be clearly seen that for a given $Q_{Total,avg}$ the power dissipation in
the probe is larger under a negative bias than a positive bias. We also
conclude that the time-averaged power dissipation in the substrate,
$Q_{S,avg}$, is smaller under a negative bias than under a positive bias, because
$Q_{P,avg} + Q_{S,avg} = Q_{Total,avg}$. To clarify the voltage biases used in the
experiments, we present (in the inset of Fig. 2b) $\Delta T_{TC,avg}$ as a function
of the magnitude of the applied voltage bias. These results unambigu- 
ously demonstrate that heat dissipation in the electrodes of Au-BDNC- 
Au junctions is bias polarity dependent and unequal. 

This observation raises an important question: why is the heat dis- 
sipation in the electrodes unequal in spite of the symmetric geometry 
of the molecular junctions? To address this question, we resort to 
the Landauer theory of quantum transport, which has successfully 
described charge transport in numerous nanostructures’’. Within this 
theory, the power dissipated in the probe and the substrate, Qp(V) and 
Qs(V), respectively, is given by”: 


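$$Q_P(V) = \frac{2}{h}\int_{-\infty}^{\infty}(\mu_P - E)\,\tau(E,V)\,[f_P - f_S]\,dE$$
$$Q_S(V) = \frac{2}{h}\int_{-\infty}^{\infty}(E - \mu_S)\,\tau(E,V)\,[f_P - f_S]\,dE \qquad (1)$$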


Here $\mu_P$ and $\mu_S$ are the chemical potentials of the probe and substrate
electrodes, respectively, $f_{P/S}$ represent the Fermi-Dirac distribution of
the probe/substrate electrodes, and $\tau(E, V)$ is the energy (E) and voltage
bias (V) dependent transmission function. Equation (1) suggests
that the power dissipation in the two electrodes is, in general, unequal,
that is, $Q_P(V) \neq Q_S(V)$, and bias polarity dependent, that is,
$Q_{P/S}(V) \neq Q_{P/S}(-V)$. Specifically, it is straightforward to show that:
$$Q_P(V) - Q_P(-V) \approx 2GTSV + O(V^3)$$
$$Q_P(V) - Q_S(V) \approx 2GTSV + O(V^3) \qquad (2)$$


Figure 2 | Relationship between heat dissipation asymmetries and electronic transmission characteristics in Au-BDNC-Au junctions. a, Horizontally offset conductance traces (inset) of BDNC junctions, along with a histogram obtained from 500 traces (main panel). The red line represents a Gaussian fit to the histogram. b, Main panel, measured time-averaged temperature rise of the thermocouple ($\Delta T_{\mathrm{TC,avg}}$) and the time-averaged power dissipation in the probe ($Q_{P,\mathrm{avg}}$) as a function of the time-averaged total power dissipation in the junction ($Q_{\mathrm{Total,avg}}$) for positive and negative biases. Error bars represent the estimated uncertainty in $\Delta T_{\mathrm{TC,avg}}$ (see Supplementary Information for details of uncertainty estimation). The computationally predicted relationship between $Q_P$ and $Q_{\mathrm{Total}}$ is shown by solid lines, which illustrate that $Q_P = fQ_{\mathrm{Total}}$, where $f$ depends on both $Q_{\mathrm{Total}}$ and the polarity of the applied bias, and is in general not equal to 0.5. The dotted line corresponds to the expected temperature rise of the probe if the heating were symmetric (that is, $f = 0.5$). Inset, measured $\Delta T_{\mathrm{TC,avg}}$ as a function of the magnitude of the applied voltage bias. Uncertainties are not shown in the inset, for visual clarity. c, $I$-$V$ characteristics of Au-BDNC-Au junctions obtained by averaging 100 individual $I$-$V$ curves (solid curve). The shaded region represents the standard deviation of the $I$-$V$ curves. d, Computed zero-bias transmission function corresponding to the Au-BDNC-Au junction shown in the inset. HOMO, highest occupied molecular orbital;
LUMO, lowest unoccupied molecular orbital.

Here, $G$ is the low-bias electrical conductance of the junction, $T$ is the absolute temperature, and $S$ is the Seebeck coefficient of the junction, whose sign is set by the first energy derivative of the zero-bias transmission at the Fermi energy ($E_F$), $\tau'(E = E_F, V = 0)$: a negative first derivative gives a positive Seebeck coefficient, and vice versa. To test whether the observed heating asymmetry can be understood within this framework, we computed $\tau(E, V = 0)$ for Au-BDNC-Au junctions using a transport method based on density functional theory (DFT; Methods). The computed transmission function (Fig. 2d) exhibits a positive slope at the Fermi energy, in agreement with past work, indicating a negative Seebeck coefficient, which by virtue of equation (2) leads to higher power dissipation in the NTISTP when negative voltages are applied to the substrate. Further, the solid lines in Fig. 2b represent the relationship between $Q_P$ and $Q_{\mathrm{Total}}$ ($Q_P + Q_S = Q_{\mathrm{Total}}$) as computed from equation (1) under the assumption that $\tau(E, V)$ is well approximated by $\tau(E, V = 0)$. Notice that although our DFT approach overestimates the linear conductance, it correctly describes the relationship between $Q_P$ and $Q_{\mathrm{Total}}$. The reasons for this agreement are discussed further in the Supplementary Information, where we show in particular that this relation is relatively insensitive to the details of the junction geometry. The good agreement between the computed and measured relations between the power dissipations provides strong support for the applicability of the Landauer theory of heat dissipation at the atomic scale.
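As a rough numerical illustration of equations (1) and (2), the sketch below evaluates the two integrals for a toy transmission function. Everything specific in it is an assumption made for illustration and not part of the original analysis: the single-level Lorentzian `lorentzian` stands in for the DFT transmission, the bias is assumed to split symmetrically ($\mu_P = +eV/2$, $\mu_S = -eV/2$, with $E_F = 0$), both electrodes are taken at the same temperature, and the transmission is taken to be independent of bias.

```python
import math

K_B = 8.617333262e-5        # Boltzmann constant in eV/K
H_PLANCK = 4.135667696e-15  # Planck constant in eV*s
EV_TO_J = 1.602176634e-19   # joules per electronvolt


def fermi(energy, mu, temperature):
    """Fermi-Dirac occupation; energies in eV, temperature in K."""
    x = (energy - mu) / (K_B * temperature)
    if x > 700.0:
        return 0.0
    if x < -700.0:
        return 1.0
    return 1.0 / (1.0 + math.exp(x))


def lorentzian(energy, level, width):
    """Toy single-level transmission (hypothetical, not the DFT result)."""
    return width ** 2 / ((energy - level) ** 2 + width ** 2)


def dissipated_power(bias, transmission, temperature=300.0,
                     half_window=3.0, steps=6000):
    """Return (Q_P, Q_S) in watts from equation (1), midpoint rule.

    Assumes mu_P = +bias/2 and mu_S = -bias/2 (in eV, E_F = 0) and a
    bias-independent transmission function of energy only.
    """
    mu_p, mu_s = 0.5 * bias, -0.5 * bias
    de = 2.0 * half_window / steps
    q_p = q_s = 0.0
    for i in range(steps):
        e = -half_window + (i + 0.5) * de
        weight = transmission(e) * (
            fermi(e, mu_p, temperature) - fermi(e, mu_s, temperature)) * de
        q_p += (mu_p - e) * weight
        q_s += (e - mu_s) * weight
    scale = 2.0 / H_PLANCK * EV_TO_J  # eV^2 / (eV s) -> W
    return q_p * scale, q_s * scale


if __name__ == "__main__":
    # Resonance above E_F gives a positive slope at E_F (negative S).
    def tau(e):
        return lorentzian(e, level=1.0, width=0.1)

    for v in (+0.5, -0.5):
        q_p, q_s = dissipated_power(v, tau)
        print(f"V = {v:+.1f} V: Q_P = {q_p:.3e} W, Q_S = {q_s:.3e} W")
```

With a resonance placed above the Fermi energy, the probe term comes out larger at negative bias than at positive bias, which is the sign of asymmetry described in the text for a positive transmission slope; an energy-independent transmission gives equal values for the two electrodes.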

To prove conclusively the relationship between electronic structure and heat dissipation, we performed additional studies on 1,4-benzenediamine (BDA; Fig. 1c) junctions, which are expected to exhibit hole-dominated electrical transport, as suggested by our calculations (Fig. 3d) and past experiments. Following a procedure similar to that described above, we first determined that the most probable low-bias conductance of Au-BDA-Au junctions was ~$0.005G_0$ (Fig. 3a), a value consistent with past work. Measurements of heat dissipation in BDA junctions (Fig. 3b) show a markedly different asymmetry. In particular, the BDA junctions show larger power dissipation in the probe for a positive bias than for a negative one, in strong contrast to the behaviour observed in BDNC junctions. To understand this important difference, we computed the transmission function of the Au-BDA-Au junction displayed in Fig. 3d, which shows that $\tau'(E = E_F, V = 0)$ is negative, resulting in a positive Seebeck coefficient. This, in turn, leads to larger power dissipation in the NTISTP at positive biases. Further, the computed relationship between $Q_P$ and $Q_{\mathrm{Total}}$ is in good agreement with our experimental observations (solid lines in Fig. 3b).

Finally, to prove that no appreciable asymmetries are obtained if the transmission is weakly dependent on energy, we studied heat dissipation in Au-Au atomic junctions. We began by measuring the conductance of Au-Au atomic junctions, which were found to have a most probable conductance of ~$G_0$, in accordance with past studies (Supplementary Information). Subsequently, we created 100 Au-Au atomic junctions with a low-bias conductance of $G_0 \pm 0.1G_0$ and probed heating in them. The measured $\Delta T_{\mathrm{TC,avg}}$ (Fig. 4a) is proportional to $Q_{\mathrm{Total,avg}}$ and is identical for positive and negative biases (within the experimental uncertainty of ~0.1 mK), clearly demonstrating that there is no detectable asymmetry in the power dissipation. Additional experiments performed at larger values of $Q_{\mathrm{Total,avg}}$ also show no detectable asymmetry (Fig. 4a inset).

Symmetric heat dissipation is indeed expected in Au-Au atomic junctions because of the weak energy dependence of their transmission function, which is reflected in the fact that their average thermopower vanishes. In Fig. 4b we present the computed zero-bias transmission corresponding to the Au-Au atomic junction shown in the left inset. The transmission is practically energy independent over 1 eV around the Fermi energy. This weak energy dependence results in symmetric power dissipation (from equations (1) and (2)) as well as linear $I$-$V$ characteristics, as evidenced by the experimentally obtained $I$-$V$ curves shown in the right inset of Fig. 4b.
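The step from an energy-independent transmission to symmetric heating can be made explicit with a short derivation. It is a sketch under stated assumptions that are not spelled out in the text: the transmission is a constant $\tau_0$, both electrodes are at the same temperature, and $\mu_P - \mu_S = eV$. Subtracting the two lines of equation (1) and writing $x = E - (\mu_P + \mu_S)/2$ gives

$$Q_P(V) - Q_S(V) = \frac{2\tau_0}{h}\int_{-\infty}^{\infty} (\mu_P + \mu_S - 2E)\,[f_P - f_S]\,dE = -\frac{4\tau_0}{h}\int_{-\infty}^{\infty} x\,g(x)\,dx,$$

$$g(x) = f\!\left(x - \tfrac{eV}{2}\right) - f\!\left(x + \tfrac{eV}{2}\right), \qquad f(y) = \frac{1}{1 + e^{y/k_B T}}.$$

Because $f(-y) = 1 - f(y)$, the function $g$ is even, $g(-x) = g(x)$, so the integrand $x\,g(x)$ is odd and the integral vanishes. Hence $Q_P(V) = Q_S(V) = Q_{\mathrm{Total}}(V)/2$ for either polarity, which corresponds to $S = 0$ in equation (2).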

The good agreement between the measured and computed asymmetries in the heat-dissipation characteristics of AMJs unambiguously confirms that heat dissipation is intimately related to the transmission characteristics of the junctions, as predicted by the Landauer theory. We note that our results contradict recent claims of asymmetric heat dissipation in Au atomic junctions, claims that do not agree with theoretical predictions. The insights obtained here regarding heat dissipation should hold for any mesoscopic system in which charge transport is predominantly elastic. Such systems include semiconductor nanowires, two-dimensional electron gases, semiconductor heterostructures, carbon nanotubes and graphene.

Figure 4 | No detectable heating asymmetry in Au-Au atomic junctions. a, The measured $\Delta T_{\mathrm{TC,avg}}$ and $Q_{P,\mathrm{avg}}$ in Au-Au atomic junctions for positive and negative biases as a function of $Q_{\mathrm{Total,avg}}$ (uncertainty of $\Delta T_{\mathrm{TC,avg}}$ is <0.1 mK for all voltage biases). Inset, results of similar measurements for a larger range of powers (uncertainty is <0.1 mK and is imperceptible in the figure). The measured temperature rise is found to be linearly dependent on $Q_{\mathrm{Total,avg}}$ and is independent of the bias polarity within experimental uncertainty.

METHODS SUMMARY


Single-molecule and atomic junctions were created by displacing the NTISTP towards a Au substrate at 5 nm s⁻¹ and withdrawing it from the substrate at 0.1 nm s⁻¹ after contact formation (indicated by an electrical conductance greater than $5G_0$). The Au substrate was coated with the desired molecules for molecular experiments and was pristine for the atomic junction studies. To obtain the conductance traces, a voltage bias of 100 mV was applied and the current was monitored during the withdrawal process. The obtained traces were analysed by creating histograms to identify the most probable conductance of AMJs. Stable single-molecule junctions with a desired conductance were created by stopping the withdrawal when a conductance plateau within 10% of the most probable conductance was obtained. All the experiments were performed in an ultrahigh-vacuum scanning probe microscope at ambient temperature. High-resolution temperature measurements were enabled by a modulation scheme in which a time-dependent voltage, $V_M(t)$, consisting of a periodic series of three-level voltage pulses ($+V_M$, 0 V, $-V_M$; Supplementary Fig. 1) was applied to the AMJs while monitoring the thermoelectric voltage output of the NTISTP. The zero-bias transmission functions (Figs 2-4) were computed with the ab initio method described in ref. 24.
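The histogram step and the 10% plateau criterion described above can be sketched as follows. This is a simplified illustration and not the authors' analysis code: the function names, the logarithmic binning, the bin density and the use of the most populated bin (instead of a Gaussian fit to the histogram) are all assumptions.

```python
import math


def most_probable_conductance(samples, g_min=1e-4, g_max=1.0,
                              bins_per_decade=20):
    """Return the centre (in units of G0) of the most populated bin.

    `samples` holds conductance values pooled from many withdrawal
    traces; values outside [g_min, g_max) are ignored.
    """
    n_bins = int(round(math.log10(g_max / g_min) * bins_per_decade))
    counts = [0] * n_bins
    for g in samples:
        if g_min <= g < g_max:
            idx = int(math.log10(g / g_min) * bins_per_decade)
            counts[min(idx, n_bins - 1)] += 1
    peak = max(range(n_bins), key=counts.__getitem__)
    # Geometric centre of the winning log-spaced bin.
    return g_min * 10.0 ** ((peak + 0.5) / bins_per_decade)


def on_plateau(g, g_peak, tolerance=0.10):
    """True if g lies within `tolerance` (10% by default) of g_peak."""
    return abs(g - g_peak) <= tolerance * g_peak


if __name__ == "__main__":
    pooled = [0.005] * 50 + [0.02] * 5 + [0.5] * 10  # synthetic values
    print(most_probable_conductance(pooled))
```

In a measurement loop, `on_plateau` would be the condition for stopping the withdrawal and holding the junction.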


Full Methods and any associated references are available in the online version of the paper.

METHODS

Creation of atomic and molecular junctions. All the AMJs were created between a NTISTP and a Au-coated substrate by displacing the NTISTP towards the substrate (which was coated with the desired molecules in molecular experiments and was pristine in atomic junction experiments) at 5 nm s⁻¹ and withdrawing it from the substrate at 0.1 nm s⁻¹ after contact formation, as indicated by an electrical conductance greater than $5G_0$. To create the desired monolayers, 1 mM solutions of BDNC and BDA molecules, obtained commercially from Sigma Aldrich with a purity of ~99%, were prepared in toluene/ethanol. Subsequently, a Au-coated mica substrate (electron beam evaporation) was placed in one of the solutions to allow self-assembly of molecules on the Au surface. After exposure for 12 h in a glove box filled with nitrogen gas, the substrates were rinsed in ethanol and dried in nitrogen gas. For the experiments involving Au-Au atomic junctions, the Au-coated substrates were cleaned in ultraviolet-radiation ozone to eliminate any organic contamination on the surface. The NTISTPs were also cleaned with ultraviolet-radiation ozone in all studies and loaded into the UHV scanning probe microscope instrument. The electrical current was measured using a current amplifier (Keithley 428), whereas thermoelectric voltage measurements were performed using a voltage amplifier (Stanford Research System 560). All the data were collected at a sampling frequency of 2 kHz using a data acquisition system (National Instruments 6281). The approach, withdraw and hold sequences were accomplished by using a real-time controller (National Instruments PXI8110).

Measurement of $\Delta T_{\mathrm{TC,avg}}$ using a modulation scheme. High-resolution temperature measurements are enabled by a modulation scheme in which a time-dependent voltage, $V_M(t)$, consisting of a periodic series of three-level voltage pulses $+V_M$, 0 V, $-V_M$ (Supplementary Fig. 1), is applied. In all the experiments performed in this work, the period ($T_P$) of the voltage pulses was chosen to be ~0.08 s (1/12.25 Hz). The selected modulation frequency is found to optimize the signal-to-noise ratio and is experimentally feasible owing to the small thermal time constants (~10 µs) of the micrometre-sized NTISTPs, which enable high-fidelity tracking of temperature changes. The applied $V_M(t)$ results in both a modulated current ($I_M(t)$; see Supplementary Fig. 1) and a modulated temperature change of the thermocouple ($\Delta T_{M,\mathrm{TC}}(t)$) due to Joule heating. Using the equation at the bottom of Supplementary Fig. 1, the time-averaged temperature rise corresponding to a positive bias, $\Delta T_{\mathrm{TC,avg}}(+V_M)$, or a negative bias, $\Delta T_{\mathrm{TC,avg}}(-V_M)$, can be directly related to the modulated thermoelectric voltage output ($\Delta V_{M,\mathrm{TC}}(t)$) of the thermocouple. In probing heat dissipation in AMJs we applied the modulated voltage signal with an appropriately chosen amplitude $V_M$ for a period of ~5 s to each AMJ. The resulting thermoelectric voltage signal $\Delta V_{M,\mathrm{TC}}(t)$ was simultaneously recorded. This was repeated on ~100 junctions to collect data for ~500 s for each $V_M$. The obtained data were concatenated and analysed to estimate $\Delta T_{\mathrm{TC,avg}}$ corresponding to positive and negative biases as described above. This modulation scheme enables temperature measurements with submillikelvin resolution, as described in the Supplementary Information. The time-averaged total power dissipation ($Q_{\mathrm{Total,avg}}$) at each bias was obtained by using the 500-s-long data corresponding to each $V_M$. Specifically, the data (measured current and known applied bias) were used to first compute the total heat dissipated at positive and negative biases. Subsequently, $Q_{\mathrm{Total,avg}}(+V_M)$ and $Q_{\mathrm{Total,avg}}(-V_M)$ were obtained by dividing the estimated total heat dissipation (corresponding to a positive or a negative bias) by the total time during which a positive bias ($+V_M$) or negative bias ($-V_M$) was applied (~500/3 s). The amplitudes ($V_M$) of the three-level voltage pulses used in our studies were chosen to be 30 mV, 43 mV, 52 mV, 60 mV and 67 mV for Au-Au junctions; 0.74 V, 0.95 V, 1.08 V, 1.18 V and 1.27 V for Au-BDNC-Au junctions; and 0.44 V, 0.58 V, 0.68 V, 0.76 V and 0.82 V for Au-BDA-Au junctions. Representative traces obtained in the experiments are shown in Supplementary Information section 6.3.
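The bookkeeping for $Q_{\mathrm{Total,avg}}$ described above (total heat at one polarity divided by the time spent at that polarity) can be sketched as below. The function names and the synthetic current values are hypothetical, and the samples are assumed to be equally spaced in time, so the sampling interval cancels out of the average.

```python
def three_level_pulse(v_m, samples_per_level, periods):
    """Sample voltages for a repeating +V_M, 0 V, -V_M pulse train."""
    one_period = ([v_m] * samples_per_level
                  + [0.0] * samples_per_level
                  + [-v_m] * samples_per_level)
    return one_period * periods


def time_averaged_power(voltages, currents):
    """Return (Q_avg at +V_M, Q_avg at -V_M) in watts.

    Each value is the summed I*V over the samples taken at one polarity
    divided by the number of such samples, i.e. total heat over total
    time for equally spaced samples. Zero-bias samples are skipped.
    """
    heat = {1: 0.0, -1: 0.0}
    count = {1: 0, -1: 0}
    for v, i in zip(voltages, currents):
        if v == 0.0:
            continue
        sign = 1 if v > 0.0 else -1
        heat[sign] += v * i
        count[sign] += 1
    return heat[1] / count[1], heat[-1] / count[-1]


if __name__ == "__main__":
    volts = three_level_pulse(1.0, samples_per_level=54, periods=10)
    # Synthetic nonlinear junction: 2 uA at +1 V, -3 uA at -1 V.
    amps = [2e-6 if v > 0 else (-3e-6 if v < 0 else 0.0) for v in volts]
    print(time_averaged_power(volts, amps))
```

The function raises `ZeroDivisionError` if one polarity is absent from the record, which is acceptable for a sketch but would need a guard in real analysis code.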

Estimating $Q_{P,\mathrm{avg}}$ from the measured $\Delta T_{\mathrm{TC,avg}}$. To relate the temperature rise of the thermocouple to the time-averaged power dissipation in the probe, $Q_{P,\mathrm{avg}}$, it is necessary to quantify the thermal resistance of the NTISTP. To elaborate, consider the resistance network shown in Fig. 1b, where the thermal resistances to heat flow in the probe ($R_P$), junction ($R_J$) and substrate ($R_S$) are identified. $R_P$ was experimentally determined to be 72,800 ± 500 K W⁻¹ (see Supplementary Information). The thermal resistances of AMJs ($R_J$) are estimated to be at least $10^7$ K W⁻¹ for all the AMJs studied here (see Supplementary Information for more details). Thus, $R_J \gg R_P$, and therefore $\Delta T_{\mathrm{TC,avg}}$ depends only on the power dissipated in the tip and is unaffected by the heating in the substrate. From a knowledge of $\Delta T_{\mathrm{TC,avg}}$ and $R_P$, the time-averaged power dissipation, $Q_{P,\mathrm{avg}}$, can therefore be estimated as $Q_{P,\mathrm{avg}} = \Delta T_{\mathrm{TC,avg}} / R_P$.
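The conversion between thermocouple temperature rise and probe power is a single division; the sketch below spells it out together with the symmetric-heating reference line, using the probe resistance quoted in the text. The function names and the 1 mK example value are illustrative assumptions.

```python
R_P = 72_800.0  # probe thermal resistance in K/W, as quoted in the text


def probe_power(delta_t_tc_avg, r_p=R_P):
    """Q_P,avg in watts from the time-averaged temperature rise in kelvin."""
    return delta_t_tc_avg / r_p


def symmetric_temperature_rise(q_total_avg, r_p=R_P):
    """Temperature rise expected if half of the total power heats the probe."""
    return 0.5 * q_total_avg * r_p


if __name__ == "__main__":
    q_p = probe_power(1e-3)  # a 1 mK rise, chosen only as an example
    print(f"Q_P,avg = {q_p * 1e9:.2f} nW")
```

For a 1 mK rise this gives roughly 14 nW in the probe, which shows why submillikelvin resolution is needed to resolve nanowatt-level differences between the two polarities.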

Computation of the transmission function. The zero-bias transmission functions shown in the manuscript were computed with the ab initio method described in detail elsewhere (ref. 24). It is based on a combination of non-equilibrium Green's function techniques and density functional theory (DFT) and was implemented in the quantum-chemistry software package Turbomole. More details can be found in the Supplementary Information.

Computing the relationship between $Q_P$ and $Q_{\mathrm{Total}}$. We computed the power dissipated in the probe, $Q_P(V)$, and the total power dissipated in the junction, $Q_{\mathrm{Total}}(V)$ ($Q_P(V) + Q_S(V) = Q_{\mathrm{Total}}(V)$), using equation (1) and the zero-bias transmission curves of the molecular junctions (shown in Figs 2d and 3d). Subsequently, $Q_P$ was plotted as a function of $Q_{\mathrm{Total}}$, as the relationship between $Q_P$ and $Q_{\mathrm{Total}}$ is robustly predicted by our calculations (see Supplementary Information for details).

Small effect of water on upper-mantle rheology
based on silicon self-diffusion coefficients


Hongzhan Fei, Michael Wiedenbeck, Daisuke Yamazaki & Tomoo Katsura


Water has been thought to affect the dynamical processes in the Earth's interior to a great extent. In particular, experimental deformation results suggest that even a few tens of parts per million of water by weight enhance the creep rates in olivine by orders of magnitude. However, those deformation studies have limitations, such as considering only a limited range of water concentrations and very high stresses, which might affect the results. Rock deformation can also be understood as an effect of silicon self-diffusion, because the creep rates of minerals at temperatures as high as those in the Earth's interior are limited by self-diffusion of the slowest species. Here we experimentally determine the silicon self-diffusion coefficient $D_{\mathrm{Si}}$ in forsterite at 8 GPa and 1,600 K to 1,800 K as a function of water content $C_{\mathrm{H_2O}}$ from less than 1 to about 800 parts per million of water by weight, yielding the relationship $D_{\mathrm{Si}} \propto (C_{\mathrm{H_2O}})^{1/3}$. This exponent is strikingly lower than that obtained by deformation experiments (1.2; ref. 7). The high nominal creep rates in the deformation studies under wet conditions may be caused by excess grain boundary water. We conclude that the effect of water on upper-mantle rheology is very small. Hence, the smooth motion of the Earth's tectonic plates cannot be caused by mineral hydration in the asthenosphere. Also, water cannot cause the viscosity minimum zone in the upper mantle. Finally, the dominant mechanism responsible for hotspot immobility cannot be water content differences between their source and surrounding regions.

Diffusion creep and dislocation creep are two important mechanisms that dominate the plastic deformation of rocks and minerals in Earth's interior. Experimental deformation studies have suggested that incorporation of water in olivine significantly enhances both dislocation and diffusion creep rates. However, we note that those studies used polycrystalline olivine samples that were over-saturated with water. In such samples, large amounts of free water may have existed on grain boundaries, leading to a large enhancement of grain boundary sliding (or pressure-solution-accommodated creep), rather than of dislocation creep or diffusion creep in the grain interior. The upper mantle, on the other hand, is water unsaturated, and free water is unlikely to exist there. Therefore, the enhancement of creep rates by free water cannot occur in the real upper mantle. We also note that the range of water contents ($C_{\mathrm{H_2O}} < 80$ wt p.p.m.) in these deformation studies is too narrow to determine accurately the effect of water from stress-strain rate measurements. This can lead to large errors in estimating the effect of water on mantle rheology.

Another problem with such rock deformation experiments is the very high stress (typically a hundred times higher than that in Earth's interior) needed to obtain experimentally determinable strain rates. High stress causes anomalously high densities of dislocations, stacking faults and sub-grain boundaries, which may lead to results that are not representative of the Earth's interior. The measurement of self-diffusion coefficients in minerals is instead an independent way to study mantle rheology, because high-temperature mineral creep is believed to be controlled by self-diffusion of the slowest species (which is silicon in the case of olivine). It allows a much wider range of experimental conditions (such as pressure and $C_{\mathrm{H_2O}}$) and does not induce unrealistically high defect densities.

Costa and Chakraborty (ref. 8) measured silicon self-diffusion coefficients ($D_{\mathrm{Si}}$) in olivine single crystals with $C_{\mathrm{H_2O}}$ values of ~40 and 370 parts per million by weight (wt p.p.m.) and concluded that even 45 wt p.p.m. of water enhances $D_{\mathrm{Si}}$ by two to three orders of magnitude by comparison with the results obtained under dry conditions by ref. 10. However, the data of ref. 8 did not show a systematic change in $D_{\mathrm{Si}}$ between $C_{\mathrm{H_2O}}$ values of ~40 and ~370 wt p.p.m. In addition, our previous study showed that ref. 10 may have underestimated $D_{\mathrm{Si}}$ under dry conditions. We therefore propose that the water effect was overestimated in ref. 8.

Here we systematically measured $D_{\mathrm{Si}}$ in olivine as a function of $C_{\mathrm{H_2O}}$. Because the effects of iron on $D_{\mathrm{Si}}$ and on creep rates are very small under upper-mantle conditions, a single-crystal forsterite sample was used. We measured its $D_{\mathrm{Si}}$ at 8 GPa, at 1,600 K and 1,800 K, and with well controlled $C_{\mathrm{H_2O}}$ from <1 up to about 800 wt p.p.m., which is realistic for the oceanic mantle. The experimental details are given in the Methods section.

Experimental results are shown in Fig. 1. $D_{\mathrm{Si}}$ increases systematically with increasing $C_{\mathrm{H_2O}}$. $D_{\mathrm{Si}}$ values under wet conditions ($C_{\mathrm{H_2O}} > 1$ wt p.p.m.) were fitted to the Arrhenius equation:

$$D_{\mathrm{Si}} = A_0\,(C_{\mathrm{H_2O}})^{r}\exp\!\left(-\frac{\Delta H}{RT}\right) \qquad (1)$$

where $A_0$ is the pre-exponential factor, $r$ is the $C_{\mathrm{H_2O}}$ exponent, $R$ is the gas constant, $T$ is the absolute temperature, and $\Delta H$ is the activation enthalpy. By fitting the experimental results to equation (1), we determined $A_0$, $r$ and $\Delta H$ to be 10 °8=°7 m² s⁻¹, 0.32 ± 0.07 and 434 ± 20 kJ mol⁻¹, respectively. The activation energy $\Delta E$ is 420 ± 23 kJ mol⁻¹ (after a pressure correction using an activation volume of 1.7 ± 0.4 cm³ mol⁻¹; ref. 11), which is essentially the same as that for dry conditions (410 ± 30 kJ mol⁻¹; see Supplementary Information).

Figure 1 | $D_{\mathrm{Si}}$ versus $C_{\mathrm{H_2O}}$ at 1,600 K and 1,800 K. The data points shown by small circles with an arrow are taken from ref. 11 on $D_{\mathrm{Si}}$ in dry forsterite at 8 GPa, with $C_{\mathrm{H_2O}} < 1$ wt p.p.m.; these water contents are below the detection resolution of FT-IR and SIMS. It was impossible to obtain data points at 1,800 K with high $C_{\mathrm{H_2O}}$ because of the low melting temperature of hydrous forsterite. Even when $C_{\mathrm{H_2O}}$ was low, the isotopically enriched thin-film coating of the diffusion couple was often damaged during annealing at this temperature. CC08 indicates data points (orange diamonds) taken from Costa and Chakraborty (ref. 8), normalized to 1,600 K and 8 GPa, using an activation energy of 358 kJ mol⁻¹ from ref. 8 and an activation volume of 1.7 cm³ mol⁻¹ from ref. 11.
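The pressure correction quoted in the text can be checked by hand. The relation $\Delta H = \Delta E + P\,\Delta V$ is the one stated in the caption of Fig. 3; the unit conversions below are the only added steps.

$$P\,\Delta V = (8\times10^{9}\ \mathrm{Pa})\,(1.7\times10^{-6}\ \mathrm{m^3\,mol^{-1}}) = 1.36\times10^{4}\ \mathrm{J\,mol^{-1}} = 13.6\ \mathrm{kJ\,mol^{-1}}$$

$$\Delta E = \Delta H - P\,\Delta V \approx 434 - 13.6 \approx 420\ \mathrm{kJ\,mol^{-1}}$$

The uncertainty in the activation volume contributes $8\ \mathrm{GPa} \times 0.4\ \mathrm{cm^3\,mol^{-1}} = 3.2\ \mathrm{kJ\,mol^{-1}}$; adding this to the 20 kJ mol⁻¹ uncertainty in $\Delta H$ gives about 23 kJ mol⁻¹, which is consistent with the quoted ±23 kJ mol⁻¹, although the text does not state how the uncertainties were combined.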

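The present results thus demonstrate $D_{\mathrm{Si}} \propto (C_{\mathrm{H_2O}})^{0.32 \pm 0.07} \approx (C_{\mathrm{H_2O}})^{1/3}$. Given that $[V_{\mathrm{Si}}'''']$ (where the four primes indicate a charge of minus four on the Si vacancies) is proportional to $(C_{\mathrm{H_2O}})^{2/3}$ (ref. 13) under the charge-neutrality condition $[(\mathrm{OH})_{\mathrm{O}}^{\bullet}] = 2[V_{\mathrm{Mg}}'']$ (ref. 14; the dot indicates one positive charge on the hydroxide ion on the oxygen site), a special explanation is necessary for the $C_{\mathrm{H_2O}}$ exponent of $D_{\mathrm{Si}}$. One hypothesis accounting for the $C_{\mathrm{H_2O}}$ exponent is that Si diffusion is controlled by $V_{\mathrm{O}}^{\bullet\bullet}$ as well as $V_{\mathrm{Si}}''''$. The Si⁴⁺ in forsterite is tightly surrounded by O²⁻ in a tetrahedron. If an oxygen ion is missing, the hopping probability of $V_{\mathrm{Si}}''''$ should greatly increase. Hence, Si diffusion may be dominated by $V_{\mathrm{O}}^{\bullet\bullet}$-associated $V_{\mathrm{Si}}''''$. Although $[V_{\mathrm{O}}^{\bullet\bullet}]$ is low, a certain proportion of $V_{\mathrm{Si}}''''$ should be associated with $V_{\mathrm{O}}^{\bullet\bullet}$ owing to the Coulomb potential. As a result, $D_{\mathrm{Si}}$ should be proportional to both $[V_{\mathrm{Si}}'''']$ and $[V_{\mathrm{O}}^{\bullet\bullet}]$. Given that $[V_{\mathrm{O}}^{\bullet\bullet}] \propto (C_{\mathrm{H_2O}})^{-1/3}$ (ref. 13), we have $D_{\mathrm{Si}} \propto [V_{\mathrm{Si}}''''] \times [V_{\mathrm{O}}^{\bullet\bullet}] \propto (C_{\mathrm{H_2O}})^{2/3} \times (C_{\mathrm{H_2O}})^{-1/3} = (C_{\mathrm{H_2O}})^{1/3}$. However, we do not know what proportion of $V_{\mathrm{Si}}''''$ is associated with $V_{\mathrm{O}}^{\bullet\bullet}$. It is possible that all the $V_{\mathrm{Si}}''''$ are associated with $V_{\mathrm{O}}^{\bullet\bullet}$ because of the high Coulomb potential. In this case, $D_{\mathrm{Si}}$ would not be proportional to $[V_{\mathrm{O}}^{\bullet\bullet}]$. Further investigation is required to explain the observed small $C_{\mathrm{H_2O}}$ exponent of $D_{\mathrm{Si}}$ in terms of defect chemistry.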

Natural iron-bearing olivine in the real mantle can contain small amounts of $\mathrm{Fe}_{\mathrm{M}}^{\bullet}$ (where Fe³⁺ on a Fe or Mg metal site has an excess charge of +1), which may change the charge neutrality conditions and the $C_{\mathrm{H_2O}}$ exponents for $D_{\mathrm{Si}}$. However, the $D_{\mathrm{Si}}$ obtained in natural olivine by ref. 8 under wet conditions at high pressure showed essentially the same increase with increasing $C_{\mathrm{H_2O}}$ from 30-50 wt p.p.m. to 370 wt p.p.m., as shown in Fig. 1. This suggests that $\mathrm{Fe}_{\mathrm{M}}^{\bullet}$ in natural olivine is not essential for Si diffusion in the investigated $C_{\mathrm{H_2O}}$ range.

If $C_{\mathrm{H_2O}}$ is extremely high, the defect chemistry could be changed by incorporation of protons in Si vacancies, and the hydrated Si vacancies ($\mathrm{H}_{\mathrm{Si}}'''$, $(2\mathrm{H})_{\mathrm{Si}}''$, $(3\mathrm{H})_{\mathrm{Si}}'$ and $(4\mathrm{H})_{\mathrm{Si}}^{\times}$, where the superscript cross indicates no excess charge on the Si vacancy), whose concentrations have a larger $C_{\mathrm{H_2O}}$ exponent (that is, 0.5-2) than $V_{\mathrm{Si}}''''$ (ref. 13), could dominate Si diffusion, possibly leading to a hydrolytic weakening of olivine. For this reason, we expect $D_{\mathrm{Si}}$ to have a larger $C_{\mathrm{H_2O}}$ exponent under high $C_{\mathrm{H_2O}}$ conditions. However, our experimental results do not show an increase in the $C_{\mathrm{H_2O}}$ exponent up to 800 wt p.p.m. Higher $C_{\mathrm{H_2O}}$ conditions are unlikely in the upper mantle except in the mantle wedge, judging from petrological studies (~70-160 wt p.p.m. of water in depleted mantle, and a value four to five times higher in enriched mantle). Therefore, the $C_{\mathrm{H_2O}}$ exponent of 1/3 is the maximum for the majority of the upper mantle.

Diffusion creep and dislocation creep in olivine at high temperatures are thought to be controlled by Si self-diffusion. Therefore, the $C_{\mathrm{H_2O}}$ exponent for $D_{\mathrm{Si}}$ should be identical to that for creep rates. However, deformation studies on olivine aggregates claimed a much larger $C_{\mathrm{H_2O}}$ exponent, 1.2 ± 0.4 (Fig. 2). We found that the infrared spectra in these studies showed tiny sharp peaks on a broad band. This suggests that most of the water existed on grain boundaries. Accordingly, the high strain rates in their wet samples might have been caused by grain boundary sliding enhanced by free water on grain boundaries (see further discussion in the Supplementary Information). This idea is also supported by the much lower creep rates obtained in single crystals of hydrous olivine than in polycrystalline samples (Fig. 2). Free water is unlikely to be present in the upper mantle (except for the mantle wedge), owing to the water-unsaturated conditions. In addition, the grain size is on the order of millimetres to centimetres in the upper mantle, meaning that grain boundary sliding would be negligible. Therefore, the creep rates of minerals in the real mantle cannot be enhanced by free water on grain boundaries.

Figure 2 | Strain rate versus $C_{\mathrm{H_2O}}$. $D_{\mathrm{Si}}$ values from this study are converted to strain rate using the proportional relationship between $D_{\mathrm{Si}}$ and strain rate, with parameters from ref. 13. All data are normalized to a pressure of 8 GPa, a temperature of 1,600 K, and a stress of 300 MPa using an activation volume of 1.7 cm³ mol⁻¹ (ref. 11), an activation energy of 420 kJ mol⁻¹, and a stress exponent of 3.5. The data points for $C_{\mathrm{H_2O}} < 1$ wt p.p.m. are treated in the same way as in Fig. 1.

Based on the small $C_{\mathrm{H_2O}}$ exponent ($r = 1/3$) determined in this study, the difference in $D_{\mathrm{Si}}$, as well as in creep rates, between rheologically dry olivine (<1 wt p.p.m.) and olivine with the maximum $C_{\mathrm{H_2O}}$ of the upper mantle (<1,000 wt p.p.m.; refs 16-18) is within one order of magnitude. Because the variation of $C_{\mathrm{H_2O}}$ in the upper mantle is very small, that is, ~100-1,000 wt p.p.m. (refs 16-18), such a range causes a difference of only ~0.3 orders of magnitude in creep rates. This is much smaller than the effect of other factors that control rheological properties, such as temperature or shear stress. Hence, we conclude that the effect of water on upper-mantle rheology is not significant, which is in complete contrast to what has been commonly accepted.
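The orders of magnitude quoted above follow directly from the power law $D_{\mathrm{Si}} \propto (C_{\mathrm{H_2O}})^{r}$. The short sketch below reproduces them and, for comparison, evaluates the same water-content range with the exponent of 1.2 reported from deformation studies. The function names are illustrative, and treating the creep rate as strictly proportional to $D_{\mathrm{Si}}$ is the assumption stated in the text.

```python
import math


def creep_rate_ratio(c_high, c_low, r=1.0 / 3.0):
    """Ratio of creep rates (or D_Si) at two water contents in wt p.p.m."""
    return (c_high / c_low) ** r


def orders_of_magnitude(ratio):
    """Express a ratio as a number of decades."""
    return math.log10(ratio)


if __name__ == "__main__":
    cases = [("dry (1) to 1,000 wt p.p.m.", 1000.0, 1.0),
             ("100 to 1,000 wt p.p.m.", 1000.0, 100.0)]
    for label, hi, lo in cases:
        small = orders_of_magnitude(creep_rate_ratio(hi, lo))
        large = orders_of_magnitude(creep_rate_ratio(hi, lo, r=1.2))
        print(f"{label}: {small:.2f} decades (r = 1/3), "
              f"{large:.2f} decades (r = 1.2)")
```

With $r = 1/3$ a tenfold change in water content shifts the creep rate by one third of a decade, whereas $r = 1.2$ would shift it by more than a full decade, which is the contrast the argument relies on.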

This small effect of water on upper-mantle rheology means that many geodynamical problems must be reconsidered. Two ideas, partial melting and hydration, have been commonly considered to explain plate motion because both could soften the oceanic asthenosphere. Previous overestimates of water effects on creep rates have erroneously supported the idea that hydration is the main reason for plate motion. Using the $C_{\mathrm{H_2O}}$ exponent of 1/3, if 75% of the original water is extracted during mantle dehydration (~110 wt p.p.m. of water before dehydration, and ~28 wt p.p.m. after dehydration), the creep rates change by a factor of only 1.6. On the other hand, the melt fraction in the asthenosphere is estimated to be 1.25-0.25% (ref. 25) or less. Such a small melt fraction enhances the creep rates by at most a factor of three. However, the high geothermal gradient in the oceanic mantle at <200 km, and especially at <100 km (about 12 K km⁻¹), causes the creep rates to increase by at least six orders of magnitude from a depth of 60 km to a depth of 200 km. Thus, the effect of the temperature gradient on creep rates appears to be much larger than that of $C_{\mathrm{H_2O}}$ or melt fraction. The softening of the oceanic asthenosphere that allows plate motion cannot occur by hydration or by partial melting.
In addition, the presence of a minimum-viscosity zone has been expected in the asthenosphere on the basis of the seismically observed low-velocity and high-attenuation zone. However, because the effect of pressure on $D_{\mathrm{Si}}$ is also small, the viscosity in the upper mantle (which is calculated using the inverse relationship between $D_{\mathrm{Si}}$ and viscosity, based on the oceanic geotherm) decreases monotonically with increasing depth (Fig. 3), even if the geothermal gradient is very small (that is, <1 K km⁻¹) at depths exceeding 200 km. Thus, on the basis of the values of $D_{\mathrm{Si}}$ and taking the effects of pressure, temperature and water content into account, the minimum-viscosity zone does not appear in the asthenosphere.

Figure 3 | Viscosity in the upper mantle. Viscosity $\eta$ is calculated from $D_{\mathrm{Si}}$ using the inverse relationship between $\eta$ and $D_{\mathrm{Si}}$ (ref. 29), that is, $\eta = 10kTr_c^2/(D_{\mathrm{Si}} m_a)$, where $k$ is the Boltzmann constant, $T$ is the absolute temperature based on the oceanic geotherm, $r_c$ is the crystal radius, and $m_a$ is the mass of a Si ion. The grain size in the mantle is assumed to be ~1 mm. $D_{\mathrm{Si}}$ is a function of temperature, $C_{\mathrm{H_2O}}$ and pressure, as given by equation (1) with $\Delta H = \Delta E + P\Delta V$, for which activation energy $\Delta E$ and activation volume $\Delta V$ values of 420 kJ mol⁻¹ and 1.7 cm³ mol⁻¹, respectively, were used (ref. 11). The influence of partial melting on viscosity is calculated from the melt fraction dependence of creep rates.

Finally, an open question in mantle dynamics is why hotspots are so immobile in the face of plate motion. If it were true that water has a large effect on mantle rheology, high values of $C_{\mathrm{H_2O}}$ in the source regions of hotspots in comparison to those in surrounding regions would be a possible explanation. However, our results demonstrate that this idea is not valid. Taking the Hawaii hotspot as an example, the $C_{\mathrm{H_2O}}$ in its source is ~750 wt p.p.m., compared with ~110 wt p.p.m. in the surrounding regions. Our results indicate that this difference would cause a viscosity contrast of a factor of two, which is rather small in comparison with that caused by the temperature difference (the source is ~200 K hotter than the surrounding mantle, resulting in a viscosity decrease by more than one order of magnitude). Hence, the $C_{\mathrm{H_2O}}$ contrast cannot be the major reason for the immobility of hotspots.
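The comparison between the water effect and the temperature effect can be reproduced with equation (1). In the sketch below the water contents, exponent, activation energy and activation volume are the values given in the text; the absolute temperatures (1,600 K for the surrounding mantle and 1,800 K for the hotspot source) and the pressure of 8 GPa are assumptions chosen only to illustrate a 200 K contrast, and viscosity is taken as inversely proportional to $D_{\mathrm{Si}}$.

```python
import math

R_GAS = 8.314462618  # gas constant in J mol^-1 K^-1


def water_factor(c_source, c_surrounding, r=1.0 / 3.0):
    """Viscosity contrast caused by a water-content contrast (wt p.p.m.)."""
    return (c_source / c_surrounding) ** r


def temperature_factor(t_hot, t_cold, activation_energy=420e3,
                       activation_volume=1.7e-6, pressure=8e9):
    """Viscosity contrast caused by a temperature contrast.

    Uses Delta H = Delta E + P * Delta V (J/mol, m^3/mol, Pa) and the
    Arrhenius term of equation (1).
    """
    delta_h = activation_energy + pressure * activation_volume
    return math.exp(delta_h / R_GAS * (1.0 / t_cold - 1.0 / t_hot))


if __name__ == "__main__":
    print(f"water:       factor {water_factor(750.0, 110.0):.1f}")
    print(f"temperature: factor {temperature_factor(1800.0, 1600.0):.0f}")
```

Under these assumptions the water contrast changes the viscosity by about a factor of two, while the 200 K temperature contrast changes it by more than a factor of ten.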


METHODS SUMMARY 


A synthetically produced forsterite single crystal was cored into disks with diameter 
1 mm and thickness 1 mm, which were used as the starting materials. The chemical 
composition of the crystal was Mg,SiO, with Ir the major impurity (~80 wt p.p.m.). 
The cored disks were doped with water at 8 GPa and 1,600 K using talc + brucite as 
a water source with enstatite + graphite/gold powder, using a multi-anvil appa- 
ratus. The variation of water contents in the samples were made by varying the ratio 
of the water source to the enstatite + graphite/gold powder. After being carefully 
polished in an alkaline colloidal silica solution, each water-doped disk was coated 
with a ~500 nm ²⁹Si-enriched Mg₂SiO₄ thin film by a pulsed laser deposition
system, and then annealed at 8 GPa and 1,600 K or 1,800 K for diffusion. The water 
contents in the samples were determined using Fourier transform infrared (FT-IR) 
spectroscopy both before and after diffusion annealing. The sample surfaces with 
the thin film were polished again in the alkaline solution after diffusion annealing to 
reduce their surface roughness, which could lead to large analytical uncertainties in 
diffusion profile analyses. The diffusion profiles were obtained using a Cameca 6f 
secondary ion mass spectrometer (SIMS). The diffusion coefficients were calculated 
by fitting the profiles to the solution of Fick’s second law. 


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS


Starting material. A single-crystal forsterite sample was obtained from Oxide 
Company, Japan. The chemical composition of the crystal is Mg₂SiO₄. Its trace-element
compositions were obtained from ref. 11. No O-H absorption bands were
detected by Fourier transform infrared (FT-IR), indicating that the water content 
was less than 1 wt p.p.m. We used disks cored from the crystal, 1 mm in diameter
and 1 mm thick, with the thickness direction oriented along the b axis.
Water-doping experiments. The cored forsterite disks were pre-annealed at 
8 GPa and 1,600 K in the presence of a water source. This step is necessary to
equilibrate the water in the crystal before diffusion annealing. 

Each forsterite disk was loaded into a platinum capsule, with an outer diameter 
of 2.0 mm and an inner diameter of 1.6 mm, with one end sealed. A mixture of talc 
and brucite powders (weight ratio 4:1) was used as the water source and also to 
control the silica activity in the capsule. The space between the forsterite disk and 
capsule wall was filled with graphite or gold + enstatite (weight ratio 35:1) powder 
for low- and high-water-content experiments, respectively, to protect the single 
crystal from mechanical damage at high pressure (Supplementary Fig. 1). The 
capsule was closed and sealed by arc welding in liquid nitrogen to minimize water 
escape from the capsule. The water content in the capsule was controlled by the 
ratio of water source to graphite or gold + enstatite. In dry experiments, graphite 
powder was loaded around the samples; the capsules were then dried in a vacuum 
oven at 470 K for at least 24 h and sealed on a hot plate to minimize the amount
of moisture absorbed from the atmosphere. The final length of capsules was 
4-4.5 mm. 

High-pressure experiments were performed using a Kawai-type multi-anvil 
apparatus at the University of Bayreuth. All experiments were performed at 
8 GPa and 1,600 K. In each run, the sealed platinum capsule was located in an 
MgO cylinder in a LaCrO₃ stepped heater with a ZrO₂ thermal insulator. An MgO
octahedron (with 5 wt% Cr₂O₃) with edge length 14 mm was used as the pressure
medium (Supplementary Fig. 1). Eight tungsten carbide cubes with 32-mm edge 
length and 8-mm truncation edge length were used to generate high pressures. The 
temperatures were measured using a W97%Re3%-W75%Re25% thermocouple, 
0.25 mm in diameter, whose junction was placed at the bottom of the capsule. The 
assembly was compressed to the target pressure over 2-4 h, heated to 1,273 K at a
rate of 50 K min⁻¹, and kept at 1,273 K for 1 h to decompose the talc and brucite and to
make the water distribution homogeneous in the capsule. The assembly was then
heated to 1,600 K in 5 min and held there for water equilibration
(50-70 h), as calculated from the hydrogen diffusion coefficients in forsterite’.
The temperature was under automatic control, thus limiting variation to less than 
2 K during annealing. After annealing, the sample was quenched by switching off 
the heating power and gradually decompressed to ambient pressure over a long 
period (15-20 h) to prevent crystal breakage. 

The forsterite disks were recovered by cutting into the platinum capsule using a 
steel blade. No obvious cracks were found in the samples if small amounts of water 
source were used. With high amounts of water source, the crystal always contained 
some cracks and broke into pieces. However, in such cases we were still able to find 
usable pieces for diffusion experiments. 

Deposition. The water-doped samples were polished using diamond powders 
with a grain size of 0.25 µm, followed by an alkaline colloidal silica solution for
>3 h until all small scratches were removed. The highly polished surface was then
coated with ~500 nm of ²⁹Si-enriched Mg₂SiO₄ and 100 nm of ZrO₂ using a pulsed
laser deposition system at the Ruhr-University of Bochum’. We also conducted
some diffusion experiments without the ZrO₂ film for comparison, and showed
that the ZrO₂ does not affect D_Si, which was already confirmed in our previous
study¹¹. Prior to each deposition, the samples were heated to 470 K for
10-15 min in the vacuum chamber of the pulsed laser deposition system so as 
to remove any free water from the sample surface. The structural water in the 
crystals did not escape during this step. 

Diffusion annealing. Each thin-film-coated sample was placed in a platinum 
capsule with the same ratio of water source and graphite or gold + enstatite as
used for the corresponding water-doping experiment and was then annealed at
8 GPa and 1,600 K or 1,800 K using the same high-pressure assembly (Supplementary
Fig. 1). The annealing durations, ranging from 5 to 41 h as summarized
in Supplementary Table 1, were estimated from silicon diffusion coefficient data 
for olivine® and forsterite’’. 

FT-IR analysis. The water contents in the samples after water-doping experi- 
ments and also after diffusion annealing were measured using a high-resolution 
FT-IR spectrometer at the University of Bayreuth, described in ref. 11. Each 
forsterite sample for FT-IR analysis was polished on both faces normal to the b 
axis using 0.25-µm diamond powder. Two hundred scans were accumulated for
each spectrum at a resolution of 1 cm⁻¹. Two or three spectra were obtained for
each sample with at least one near the centre of the disk and one near the edge. One 
sample (V720) was also polished parallel to the b axis, and the water content was 
obtained as a function of distance from the coated thin film at 60-µm steps. After a
background baseline correction and thickness normalization to 1 cm, the water 
contents were determined using the calibration given by** 


$$C_{\mathrm{H_2O}} = 0.188 \times \int k(\nu)\,\mathrm{d}\nu \qquad (2)$$


where C_H2O is the water content in wt p.p.m. and k(ν) is the absorption
coefficient at wavenumber ν. Integration was performed between 3,000 cm⁻¹
and 4,000 cm⁻¹ (ref. 11). The C_H2O results for the samples are shown in the
Supplementary Information. 
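
The calibration in equation (2) amounts to a numerical integration of the baseline-corrected, thickness-normalized spectrum. The following is a minimal sketch of that step, not the authors' processing code: the function names are hypothetical, the trapezoidal rule is an assumed integration scheme, and the spectrum is a synthetic Gaussian band rather than measured data.

```python
import math


def integrated_absorbance(wavenumbers, k):
    """Trapezoidal integral of k(v) dv (wavenumbers and k both in cm^-1)."""
    total = 0.0
    for i in range(1, len(wavenumbers)):
        dv = wavenumbers[i] - wavenumbers[i - 1]
        total += 0.5 * (k[i] + k[i - 1]) * dv
    return total


def water_content_ppm(wavenumbers, k, lo=3000.0, hi=4000.0, factor=0.188):
    """C_H2O in wt p.p.m. from equation (2), integrating between lo and hi."""
    window = [(v, a) for v, a in zip(wavenumbers, k) if lo <= v <= hi]
    vs = [v for v, _ in window]
    ks = [a for _, a in window]
    return factor * integrated_absorbance(vs, ks)


# Synthetic spectrum (illustrative only): one O-H band centred at 3,600 cm^-1
# with peak height 1 cm^-1 and standard deviation 20 cm^-1, sampled every 1 cm^-1.
nu = [2900.0 + i for i in range(1201)]
k = [math.exp(-0.5 * ((v - 3600.0) / 20.0) ** 2) for v in nu]

c_h2o = water_content_ppm(nu, k)
print(f"C_H2O = {c_h2o:.2f} wt p.p.m.")
```

For this synthetic band the integral has the closed form 20√(2π) cm⁻², which makes the numerical result easy to verify.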

SIMS analysis. The apparent diffusion profiles were measured by secondary ion 
mass spectrometry (SIMS) depth profiling using the Cameca IMS-6f installed at 
the Helmholtz Centre in Potsdam, Germany, with the same set-up for determining 
D_Si in dry forsterite as in our previous study¹¹. The depth of each SIMS crater was
determined using a 3D-Nanofocus vertical microscope at the University of 
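Bayreuth. D_Si was obtained by fitting the data to the solution of Fick’s second
law

$$c = \frac{c_1 + c_0}{2} - \frac{c_1 - c_0}{2}\,\operatorname{erf}\!\left(\frac{x - h}{\sqrt{4 D_{\mathrm{Si}}\, t + L(\sigma)^2}}\right) \qquad (3)$$
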
where c is the observed abundance of ²⁹Si, c₁ is the initial abundance of ²⁹Si in the
isotopic film, c₀ is the initial abundance of ²⁹Si in the substrate, x is the distance
from the surface, h is the position of the boundary between the thin film and
substrate, t is the annealing time, L(σ) is the nominal diffusion length in zero-time
diffusion runs related to surface roughness (discussed below), and erf(z) is the error
function". An example of the diffusion profiles is shown in Supplementary Fig. 2.
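
A fit of this kind can be sketched as follows. This is an illustration under stated assumptions, not the fitting routine used in the study: the profile is taken to be an error-function step across the film-substrate boundary, broadened by 4·D_Si·t and by L(σ)² added in quadrature, the data are synthetic and noise-free, and a simple grid search over candidate diffusivities stands in for a proper least-squares optimizer. All names and numerical values are hypothetical.

```python
import math


def si29_profile(x, c1, c0, h, D, t, L=0.0):
    """29Si abundance at depth x (m) for a film/substrate step located at h."""
    width = math.sqrt(4.0 * D * t + L * L)
    return 0.5 * (c1 + c0) - 0.5 * (c1 - c0) * math.erf((x - h) / width)


def fit_diffusivity(xs, cs, c1, c0, h, t, L, candidates):
    """Return the candidate D (m^2/s) with the smallest sum of squared residuals."""
    def misfit(D):
        return sum(
            (si29_profile(x, c1, c0, h, D, t, L) - c) ** 2
            for x, c in zip(xs, cs)
        )
    return min(candidates, key=misfit)


# Illustrative inputs: enriched film (0.90) on a substrate with natural
# 29Si abundance (~0.047), 500 nm film, 20 h anneal, 50 nm roughness term.
c1, c0 = 0.90, 0.047
h, t, L = 500e-9, 20 * 3600.0, 50e-9
D_true = 1e-19  # m^2/s, assumed for the synthetic profile

xs = [i * 10e-9 for i in range(151)]  # depths from 0 to 1,500 nm
cs = [si29_profile(x, c1, c0, h, D_true, t, L) for x in xs]

# Log-spaced candidates from 1e-21 to 1e-17 m^2/s.
candidates = [10.0 ** (-21.0 + 0.05 * i) for i in range(81)]
D_fit = fit_diffusivity(xs, cs, c1, c0, h, t, L, candidates)
print(f"recovered D_Si = {D_fit:.2e} m^2/s")
```

Treating L(σ) as a fixed input mirrors the role of the zero-time runs: the roughness contribution is calibrated separately, so that only D_Si is adjusted in the fit.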
Surface problem. Because of the crystallization of thin films, the surface rough- 
ness significantly increased during high-temperature annealing and became the 
major analytical uncertainty source’’. Hence, the sample surfaces after diffusion 
annealing were chemically polished in an alkaline colloidal silica solution until the 
roughness was reduced to <50 nm, measured with a 3D-Nanofocus vertical
microscope at the University of Bayreuth. Only a thin layer (<200 nm), located well
beyond the apparent diffusion profile, was removed during the final chemical 
polishing’’. In addition, the apparent diffusion lengths obtained by SIMS were 
also corrected using a roughness calibration line obtained by a series of zero-time 
runs (equation (3)), in which the nominal diffusion lengths L are approximately a 
linear function of the standard deviation σ of the surface roughness at the bottoms
of the craters (Supplementary Fig. 3). A detailed discussion of the surface problem
is given in ref. 11.
Early-life dietary transitions reflect fundamental aspects of primate
evolution and are important determinants of health in contem- 
porary human populations’”. Weaning is critical to developmental 
and reproductive rates; early weaning can have detrimental health 
effects but enables shorter inter-birth intervals, which influences 
population growth’. Uncovering early-life dietary history in fossils 
is hampered by the absence of prospectively validated biomarkers 
that are not modified during fossilization*. Here we show that large 
dietary shifts in early life manifest as compositional variations in 
dental tissues. Teeth from human children and captive macaques, 
with prospectively recorded diet histories, demonstrate that barium 
(Ba) distributions accurately reflect dietary transitions from the 
introduction of mother’s milk through the weaning process. We 
also document dietary transitions in a Middle Palaeolithic juvenile 
Neanderthal, which shows a pattern of exclusive breastfeeding 
for seven months, followed by seven months of supplementation. 
After this point, Ba levels in enamel returned to baseline prenatal 
levels, indicating an abrupt cessation of breastfeeding at 1.2 years of 
age. Integration of Ba spatial distributions and histological map- 
ping of tooth formation enables novel studies of the evolution of 
human life history, dietary ontogeny in wild primates, and human 
health investigations through accurate reconstructions of breast- 
feeding history. 

Weaning, the dietary transition from breast milk to exclusive solid 
food intake, concludes several years earlier in modern humans than in 
other great apes*®. Cross-cultural studies of nonindustrial societies 
reveal remarkable variation in weaning practices’. However, among 
non-human primates, dietary transitions remain understudied*’. In 
addition to the paucity of comparative primate data, our understand- 
ing of the evolution of human weaning has been limited by difficulties 
in assessing the precise timing and nature of dietary transitions during 
infancy’. Dental hard tissues are particularly valuable for reconstruct- 
ing diet as they contain precise temporal and chemical records of early 
life*. Teeth begin forming in utero, record birth as the neonatal line, 
and manifest daily growth lines, which allow chronological ages to be 
determined at various positions within tooth crowns and roots (Sup- 
plementary Fig. 1). 

We propose that micro-spatial analysis of barium/calcium ratios 
(Ba/Ca) in dental tissues represents a powerful approach to assess 
dietary transitions. Whereas prenatal Ba transfer is restricted by the 
placenta, marked enrichment occurs immediately after birth from 
mother’s milk or infant formulas, which contain higher Ba levels than 
umbilical cord sera’®. In response to these variations in dietary Ba 
exposure, Ba/Ca in enamel and dentine should increase at birth, 
remain elevated for the duration of exclusive breastfeeding and rise 
further with introduction of infant formula. Circulating Ba levels are
expected to change at weaning as Ba (and Ca) content and bioavailability
are markedly different across plant and animal food sources!"””.
To test this hypothesis, we investigated Ba/Ca patterns in teeth from 
human children for whom early life diets were recorded prospectively, 
and in teeth from captive macaques in which maternal milk was col- 
lected and suckling behaviour observed. 

High-resolution elemental analysis by laser ablation-inductively 
coupled plasma-mass spectrometry revealed marked Ba/Ca increases 
in enamel and dentine formed immediately after birth in human 
deciduous teeth (n = 22 of 25 individuals) (Fig. 1a-c). In 9 of 13 children
who were initially breastfed and given infant formula later, two
distinct zones of Ba/Ca distribution were apparent in postnatal regions 
formed before crown completion (Fig. 1d, e). Histological analysis 
(Supplementary Fig. 2) revealed a close correspondence between the 
formation time of the first zone and maternal reports of exclusive 
breastfeeding. Four individuals who continued to consume breast milk 
for a long period (9-42 months) after the introduction of formula at 
1-2 months did not show two distinct Ba/Ca zones in enamel or dent- 
ine. In children for whom formula was introduced almost immediately 
after birth and who were breastfed for less than 1 month (Fig. 1e), the
first Ba/Ca zone immediately adjacent to the neonatal line was nar- 
rower than in infants exclusively breastfed for longer. Individuals who 
were exclusively breastfed during the entire period of tooth crown 
formation (n = 7 of 25; Fig. 1f) showed an increase in Ba/Ca across 
the neonatal line, but as expected, no subsequent Ba/Ca zoning was 
apparent in postnatally formed dentine (as seen in infants who made a 
transition from breast milk to formula). Thus, Ba enrichment provides 
unambiguous evidence for postnatal feeding, as well as the beginning of 
supplementation; however, the transition from exclusive breastfeeding 
to formula intake may be obscured when breast milk remains the pre- 
dominant dietary component after formula introduction. The extent of 
Ba/Ca increase at birth varies due to inter-individual differences in 
breast milk and formula Ba content. This is illustrated in Fig. 1f where 
the rise in Ba/Ca in response to breastfeeding is lower than in other 
individuals (Fig. 1d, e). Data on Ba/Ca values are given in Supplemen- 
tary Table 1 and Supplementary Fig. 3. 

Macaque permanent first molars also showed clear distinctions in 
Ba/Ca between pre- and postnatal regions, and close correspondence 
of postnatal changes in dental tissues and mother’s milk (Supplemen- 
tary Fig. 4). Although more diffuse owing to the nature of minerali- 
zation, Ba/Ca patterns in enamel correlated closely with dentine. 
Temporal mapping revealed Ba/Ca increases for the first 3-3.5 months 
of postnatal life (Fig. 2, Supplementary Figs 4-6 and Supplementary 
Table 2), followed by decreases that correlated with declines in suck- 
ling time and the initiation of solid food consumption. Moreover, 
Ba/Ca decreased more gradually during natural weaning than in 


¹Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA. ²Environmental and Occupational Medicine and Epidemiology, Harvard School of Public
Health, Boston, Massachusetts 02115, USA. ³Institute of Dental Research, Westmead Millennium Institute, Westmead Hospital, and Oral Pathology and Oral Medicine, Faculty of Dentistry, University of
Sydney, Sydney, New South Wales 2145, Australia. ⁴Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138, USA. ⁵Center for Environmental Research and
Children’s Health, School of Public Health, University of California, Berkeley, California 94720, USA. ⁶California National Primate Research Center, Davis, California 95616, USA. ⁷Southern Cross
GeoScience, Southern Cross University, Lismore, New South Wales 2480, Australia. ⁸Elemental Bio-imaging Facility, University of Technology Sydney, Sydney, New South Wales 2007, Australia. ⁹The Florey
Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Victoria 3010, Australia.


*These authors contributed equally to this work. 


216 | NATURE | VOL 498 | 13 JUNE 2013 


©2013 Macmillan Publishers Limited. All rights reserved 




Figure 1 | Barium distribution in human deciduous teeth. a, Ba/Ca map of 
incisor. Dentine horn is indicated by an arrowhead. b, Area highlighted in a and 
polarized light micrograph. In dentine (D), Ba/Ca levels show a marked 
increase coinciding with the neonatal line (white arrowheads). The neonatal 
line in enamel (E) is indicated by black arrowheads and the enamel-dentine 
junction by arrows. c, Ba/Ca measured adjacent to the enamel-dentine junction
from dentine horn to cervix of a, which rose at birth and with the introduction 


individuals who experienced truncated weaning periods (Fig. 2). In the 
most extreme case, an individual separated from its mother for several 
weeks at 166 days of age, precipitating cessation of milk synthesis and 
mammary gland involution, showed an abrupt Ba/Ca drop (Fig. 2c), 
which was independently estimated at 151-183 days of age. 

Building upon our prospectively validated human and maca- 
que results, we precisely documented diet transitions in a juvenile 
Neanderthal’’. Barium is incorporated into the mineral phase (hydro- 
xyapatite) during tooth calcification, which occurs rapidly after secre- 
tion in dentine, and more slowly and diffusely in enamel during 
maturation'*'*. Trace elements such as Ba are more resistant to post- 
mortem diagenetic alteration in enamel than in dentine, due in part to 
the greater original mineral content and lack of natural pores’®. Thus, 
the distribution of Ba/Ca in well-preserved tooth enamel may yield 
direct information on early-life dietary transitions in fossil hominins. 

Chemical and temporal mapping of Neanderthal first molar enamel 
(Fig. 3) revealed a transition pattern similar to the macaque that 
weaned abruptly. After approximately 13 days of prenatal enamel 
formation, Ba/Ca near the enamel-dentine junction increased and 
of infant formula (21-24 days). The x axis shows days since birth (B). d-f, Three
diet patterns: breastfeeding for 3 months (dotted white line) followed by 
exclusive formula feeding (solid black line) (d); formula introduced within 

1 week of birth (solid black line) (e); and exclusive breastfeeding (dotted white 
line) (f). The neonatal line is indicated by a dashed white line. Intensity indices 
are Ba/Ca × 10⁻⁴. High Ba/Ca ratios adjacent to pulp (red zone) are in secondary
dentine, a later-forming region not relevant to the current study (see ref. 12). 


remained elevated until approximately 227 days of age (~7.5 months), 
followed by intermediate values until 435 days of age (1.2 years). After 
this age Ba/Ca rapidly returned to prenatal levels for the final 1.15 years 
of crown formation. The Ba/Ca patterns in enamel were not observed 
in dentine due to diagenetic modification after death. However, dia- 
genesis did not seem to have a significant influence on enamel, as con- 
centrations of diagenetic indicators’” were low (Supplementary Table 3 
and Supplementary Fig. 7). Furthermore, enamel Ba/Ca levels were 
similar to published values for other hominins'’'’, and Ba/Ca shifts 
were similar in form and timing between both mesial cusps, suggesting 
that the transition represents biogenic input rather than post-mortem 
modification (Supplementary Discussion). Although the subsurface 
occlusal and cervical enamel appears to show minor cracks that may 
lead to local modification!’, most of the tooth crown is intact and 
naturally coloured. The Scladina individual has also yielded mtDNA 
and enamel proteins’””®, indicating that it is a well-preserved fossil. 
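
The ages quoted above for the Neanderthal Ba/Ca transitions (227 and 435 days) can be converted to months and years with a short calculation. This is a minimal sketch; the mean year length of 365.25 days and the function names are assumptions made for illustration.

```python
DAYS_PER_YEAR = 365.25            # assumed mean year length
DAYS_PER_MONTH = DAYS_PER_YEAR / 12.0


def days_to_months(days):
    """Convert an age in days to mean calendar months."""
    return days / DAYS_PER_MONTH


def days_to_years(days):
    """Convert an age in days to years."""
    return days / DAYS_PER_YEAR


exclusive_end = 227   # end of the elevated Ba/Ca zone, days after birth
weaning = 435         # return to prenatal Ba/Ca levels, days after birth

print(f"exclusive breastfeeding: {days_to_months(exclusive_end):.1f} months")
print(f"supplementation: {days_to_months(weaning - exclusive_end):.1f} months")
print(f"age at weaning: {days_to_years(weaning):.2f} years")
```

The two intervals come out at roughly seven months each, with cessation at about 1.2 years, matching the figures given in the text.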
Strontium/calcium ratios (Sr/Ca) in tooth enamel have been inter- 
preted to reveal dietary transitions in baboons and humans”’. 
However, these events were inferred from species-typical norms or 
Figure 2 | Barium distribution reveals natural and truncated weaning.

a, Macaque 515: natural weaning after 296 days. b, Macaque 152: weaned slightly 
early due to maternal separation at 257 days. c, Macaque 401: markedly truncated 
weaning owing to maternal separation at 166 days. This individual’s weight 
fluctuated during the final 7 months of life due to illness; post-weaning 
enrichment may be due to release from skeletal stores*’. Diet transitions:
prenatal regions (arrowhead), exclusive mother’s milk (MM), transitional (T) 
periods, and post-weaning regions delineated in enamel (dotted lines) and dentine 
(black arrows). The enamel-dentine junction is indicated with a dashed line. The y 
axis shows enamel Ba/Ca adjacent to the enamel-dentine junction. The x axis
shows days since birth (B) and weaning (red line). Elemental maps of dentine and 
enamel were rendered on different scales to show Ba/Ca transitions clearly. 
recalled retrospectively years after the event, which may be subject to
significant recall bias’. In light of this, and concerns that Sr might be 
more susceptible to diagenetic alteration than Ba due to its higher 
diffusivity?***, a post hoc comparison of Sr/Ca and Ba/Ca was conducted.
We found that the reconstruction of diet history from Sr/Ca
mapping was impeded due to proportionately smaller changes in 
Sr levels across transitions and inconsistent patterns in human and 
macaque samples (Supplementary Figs 3, 8 and 9, Supplementary 
Tables 4 and 5, and Supplementary Discussion). Two distinct regions 
between birth and 1.2 years were observed in the Neanderthal tooth for 
Ba/Ca and Sr/Ca, representing exclusive breastfeeding and solid food 
supplementation, although this is less clear from Sr/Ca when compared 
to Ba/Ca (Supplementary Fig. 10). Thus, Ba/Ca provides greater reso- 
lution of dietary transitions than Sr/Ca in extant and fossil material. 
Nonetheless, measurements of Sr isotopes in enamel have yielded 
useful data on diet and migration in early hominins"*. 
Figure 3 | Dietary transitions in a Neanderthal permanent first molar.

a, Developmental time (in days from birth) of stress lines in enamel (dark blue 
lines) was determined from daily growth increments (following dotted blue 
lines). Scale bar, 1 mm. b, Ba/Ca map shows marked variations in enamel at 
birth, 227 and 435 days, which resemble human and macaque transitions from 
exclusive maternal milk (MM) consumption to supplementation. c, Ba/Ca in 
enamel adjacent to the enamel-dentine junction. The x axis shows days from 
birth (B) to proposed exclusive MM, transitional diet (T) periods and 
hypothesized weaning event (red line). Elevated Ba/Ca levels at the very 
beginning and end of crown formation are probably due to subtle diagenetic 
modification’’. 


We have shown a direct correlation between Ba/Ca distributions in 
human deciduous teeth and breastfeeding data collected prospectively, 
thereby avoiding recall bias. In the macaques, patterns of suckling 
behaviour and Ba concentration in mother’s milk are consistent with 
Ba/Ca in dental tissues, which consistently show a decrease in Ba/Ca 
from the onset of supplementation. Taken collectively, these results 
demonstrate that Ba/Ca in teeth effectively reflects Ba intake via
mother’s milk, and can be used to document developmental transi- 
tions in future studies of wild primate skeletal material, and for assess- 
ments of human health outcomes. 

In the Scladina Neanderthal, the protracted weaning process typical 
in primates was interrupted by unknown cause(s), precipitating abrupt 
cessation of suckling. The period of exclusive breastfeeding in this 
Neanderthal is consistent with other hominoids; human hunter- 
gatherers and wild chimpanzees also begin to supplement milk with 
solid food by around 6 months of age**’. Humans and chimpanzees 
may wean offspring as early as 1.0 and 4.2 years, respectively, without 
serious health effects, but average 2.3-2.6 years’ and 5.3 years. When
applied to additional samples, our approach will allow the evaluation 
of hypotheses that Neanderthal young routinely weaned at later ages 
than Upper Palaeolithic hominins”’, or possessed faster life histories 
than modern humans”, which have important implications for models 
of hominin population growth and species replacement. 


METHODS SUMMARY 


Human teeth were supplied from the Center for the Health Assessment of Mothers 
and Children of Salinas study, Monterey County, California, USA”. Pregnant 
women were recruited before 20 weeks gestation, and data on breastfeeding and 
use of infant formulas were prospectively collected. From the 7-year assessment 
onwards, mothers were asked to bring in a tooth the child had shed, which was 
prepared according to standard histological techniques. The neonatal line was 
used to identify pre- and postnatal developmental periods. Prominent long-period 
incremental lines were mapped, and daily growth cross-striations in enamel were 
measured to determine the average daily enamel secretion rate. Macaque samples 
were obtained from two mother-infant dyads and two additional juveniles at the 
California National Primate Research Center (CNPRC), UC Davis, California, 
USA. Mothers and infants were captured for milk collection and morphometric 
measurements three times during lactation. Methods for rhesus macaque milk 
collection are described elsewhere’. In the week before milk collection, observa- 
tions of infant suckling behaviour were recorded’*. Dentitions were collected 
opportunistically during animal necropsy in conjunction with the CNPRC Bio- 
logical Specimens Program. First molars were dissected out after fixation, and 
histological sections were prepared and analysed following established protocols’’. 
The Scladina Neanderthal upper first maxillary molar had been previously sec- 
tioned and temporally mapped". Laser ablation-inductively coupled plasma-mass 
spectrometry (LA-ICP-MS) was used for elemental analysis of all samples 
according to published protocols”’. Instrument parameters were selected to generate
images with pixel sizes of approximately 900 µm². Reported element ratios
(Ba/Ca × 10⁻⁴ and Sr/Ca X 10 °) were calculated from concentrations determined
using Ca, ⁸⁸Sr and ¹³⁸Ba with standard NIST 1486 bone meal. Other
elements were quantified against NIST 612 as a standard. Changes in Ba/Ca were 
assigned ages by overlaying photomicrographs from histological temporal maps, 
which were registered along the enamel-dentine junction. 
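
The pixel size of approximately 900 µm² follows from the acquisition settings reported in the full Methods (30 µm spot, 60 µm s⁻¹ scan speed, 0.50 s integration time). The sketch below reproduces that arithmetic; it assumes that one pixel spans the spot diameter across the scan line and the distance travelled during one integration period along it, and the function name is hypothetical.

```python
def pixel_area_um2(spot_um, scan_speed_um_per_s, integration_s):
    """Approximate pixel area for a line-rastered LA-ICP-MS image.

    Assumes the pixel width equals the laser spot diameter and the pixel
    length equals the distance scanned during one integration period.
    """
    length_um = scan_speed_um_per_s * integration_s
    return spot_um * length_um


area = pixel_area_um2(30.0, 60.0, 0.50)
print(f"pixel area = {area:.0f} um^2")
```

With these settings the stage advances 30 µm per integration period, so each pixel is close to square.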


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS

Human study participants. We used teeth from children enrolled in the Center 
for the Health Assessment of Mothers and Children of Salinas (CHAMACOS) 
study in Monterey County, California*’*’. Pregnant women in the CHAMACOS 
cohort were recruited before 20 weeks gestation, and data on breastfeeding and 
infant formulas used were prospectively collected. Interviews were conducted with 
participants twice during pregnancy (at the end of the first and second trimesters), 
immediately postpartum, and when children were approximately 6, 12, 24 and 
42 months old. Interviews were conducted in person, either at the study office or in 
a modified recreational vehicle that was used as a mobile office at the participant’s 
home. All questionnaires were administered in English or Spanish by trained 
bicultural interviewers, with most interviews (94%) conducted in Spanish. Study 
instruments were developed in English, translated, and validated by Mexican- 
American immigrant staff members familiar with the language of the community 
and of southern Mexico from where many participants migrated. 

At the second pregnancy interview (mean = 27 weeks gestation), the partici- 
pant was asked if she intended to breastfeed her child. At each of the postpartum 
interviews she was asked if she was currently breastfeeding. At the interview when 
the mother first answered that she was no longer breastfeeding, she was then asked 
the child’s age when she had completely stopped breastfeeding and the reasons for 
stopping. Additionally, at the 6-month interview, the mother was asked if her child 
was receiving formula, and if so, at what age formula had been introduced. At the 
12-month interview, she was asked at what age formula, solid foods and cow’s milk 
were each introduced. Duration of exclusive breastfeeding was defined as the 
period between birth and the age when food or liquid other than breast milk or 
water was first given. All procedures were reviewed by the University of California 
at Berkeley Committee for the Protection of Human Subjects. Written informed 
consent was obtained from parents of all participating children and oral assent was 
obtained from 7-year-olds.

From the 7-year assessment onwards, mothers were asked to bring in a tooth the 
child had shed. We randomly selected deciduous teeth that were free of obvious 
defects (caries, hypoplasias, fluorosis, cracks, extensive attrition) from 25 children 
who fell into one of three categories: exclusively breastfed from birth; initially 
breastfed with formula introduced within 1-2 months of birth; or exclusively 
formula fed soon after birth. We prepared ~100-150-µm-thick sections in an
axial labio-lingual plane following established methods. Developmental times 
were assigned to marked shifts in Ba/Ca in tooth sections with histological ana- 
lyses. We photographed the enamel-dentine junction and the neonatal line in 
enamel and dentine. We overlaid these photomicrographs on our elemental maps 
to distinguish pre- and postnatal regions (Fig. 1). In teeth of children whose 
mothers introduced formula within 1-2 months of birth, we noticed clear high 
Ba/Ca bands in the postnatally formed dentine some distance from the neonatal 
line. To assign a developmental time to these zones, we used polarized light micro- 
scopy to visualize prominent long-period incremental lines and cross-striations 
(daily growth increments) in enamel, and measured the distance between consecu- 
tive cross-striations to determine the average daily enamel secretion rate. Deve- 
lopmental times were then assigned to different points in enamel and dentine along 
the enamel-dentine junction. 
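
The age assignment described above reduces to dividing a distance measured in enamel by the average daily secretion rate. The following sketch shows only that arithmetic: the spacing values and the distance are hypothetical, and a constant secretion rate between the neonatal line and the point of interest is a simplifying assumption.

```python
def daily_secretion_rate(spacings_um):
    """Average distance between consecutive cross-striations (um per day)."""
    return sum(spacings_um) / len(spacings_um)


def age_in_days(distance_um, rate_um_per_day):
    """Days of enamel formation represented by a distance from the neonatal line."""
    return distance_um / rate_um_per_day


# Hypothetical cross-striation spacings measured in one region of enamel (um).
spacings = [3.8, 4.1, 4.0, 4.2, 3.9]
rate = daily_secretion_rate(spacings)

# Hypothetical distance from the neonatal line to a Ba/Ca transition (um).
age = age_in_days(360.0, rate)
print(f"secretion rate = {rate:.2f} um/day, age at transition = {age:.0f} days")
```

In practice the rate varies through the crown, so a calculation of this kind would be applied piecewise between successive long-period lines rather than over the whole distance at once.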

Macaques. Data and samples were obtained from two mother-infant dyads and 
two additional juveniles at the California National Primate Research Center, UC
Davis, California. All subjects were housed in large, intact social groups in outdoor
corrals (0.2ha). Mothers received a nutritionally complete commercial diet 
(Outdoor Monkey Lab Diet, PMI Nutrition, Intl) twice daily. Subjects were part 
of a larger, on-going study on lactation and infant development*. Three times 
during lactation, at infant age 1, 3-4 and 5-6 months, mothers and infants were 
relocated for milk collection and morphometric measurements as described in 
detail elsewhere*. In the week previous to milk collection, trained technicians con- 
ducted four 10-min focal observations between 8:30 and 12:30 and recorded dura- 
tion of infant suckling behaviour”. All experimental procedures were conducted in 
accordance with ethical guidelines and with UC Davis Institutional Animal Care 
and Use Committee approval. Dentitions were collected opportunistically during 
animal necropsy as part of the CNPRC Biological Specimens Program. 
Neanderthal sample. The Scladina Neanderthal upper first maxillary molar was 
sectioned and temporally mapped for a previous developmental study that
established that this individual died at approximately 8 years of age.

Ba measurements in teeth using laser ablation-inductively coupled plasma- 
mass spectrometry (LA-ICP-MS). We used a New Wave Research UP-213 laser 
ablation system equipped with a Nd:YAG laser emitting a nanosecond laser pulse 
in the fifth harmonic with a wavelength of 213 nm. The laser was connected to an 
Agilent Technologies 7500cs ICP-MS by Tygon tubing. Details of our analytical 
methods have been published previously. In brief, the laser beam was rastered
along the sample surface in a straight line. A laser spot size of 30 µm, laser scan
speed of 60 µm s⁻¹ and ICP-MS total integration time of 0.50 s produced data
points that corresponded to a pixel size of approximately 900 µm². Reported
element ratios (Ba/Ca X 10° * and Sr/Ca X 10°) were calculated from
concentrations determined using Ca, ⁸⁸Sr and Ba isotopes and NIST 1486 bone meal as a
standard. NIST 1486 was not certified for Ba, so an average concentration
calculated from determinations in two other studies was used. Diagenetic indicators
were quantified using NIST 612 glass standard. Each line of ablation produced a 
single data file in comma separated value (.csv) format. Data were processed using 
Interactive Spectral Imaging Data Analysis Software (ISIDAS), a custom-built 
software tool written using Python programming language. ISIDAS reduced all 

Congenital heart disease (CHD) is the most frequent birth defect,
affecting 0.8% of live births. Many cases occur sporadically and
impair reproductive fitness, suggesting a role for de novo mutations.
Here we compare the incidence of de novo mutations in 362
severe CHD cases and 264 controls by analysing exome sequencing
of parent-offspring trios. CHD cases show a significant excess of
protein-altering de novo mutations in genes expressed in the developing
heart, with an odds ratio of 7.5 for damaging (premature
termination, frameshift, splice site) mutations. Similar odds ratios
are seen across the main classes of severe CHD. We find a marked
excess of de novo mutations in genes involved in the production,
removal or reading of histone 3 lysine 4 (H3K4) methylation,
or ubiquitination of H2BK120, which is required for H3K4
methylation. There are also two de novo mutations in SMAD2,
which regulates H3K27 methylation in the embryonic left-right
organizer. The combination of both activating (H3K4 methylation)
and inactivating (H3K27 methylation) chromatin marks
characterizes 'poised' promoters and enhancers, which regulate
expression of key developmental genes. These findings implicate
de novo point mutations in several hundreds of genes that collectively
contribute to approximately 10% of severe CHD.

From more than 5,000 probands enrolled in the Congenital Heart 
Disease Genetic Network Study of the National Heart, Lung, and Blood 
Institute Paediatric Cardiac Genomics Consortium, we selected 362
parent-offspring trios comprising a child (proband) with severe CHD 
and no first-degree relative with identified structural heart disease. Pro- 
bands with an established genetic diagnosis were excluded. There were 154 
probands with conotruncal defects, 132 with left ventricular obstruction, 
70 with heterotaxy and six with other diagnoses (Supplementary Table 1). 


Genomic DNA samples from trios underwent exome sequencing
(see Methods). Targeted bases in each sample were sequenced a mean of 
107 times by independent reads, with 96.0% read eight or more times. In 
parallel, 264 trios comprising unaffected siblings of autism cases and 
their unaffected parents (Supplementary Table 1) were sequenced in the 
same facility using the same protocol and were analysed as a control 
group (Supplementary Table 2 and Supplementary Fig. 1). Family
relationships were confirmed from sequence data in all trios. 

High-probability de novo variants in probands were identified using 
a Bayesian quality score (QS; see Methods). Sanger sequencing of 181 
putative de novo mutations across the QS spectrum demonstrated 
strong correlation of confirmation with QS (R² = 0.89), with 100%
confirmation of 90 calls with QS > 50 (Supplementary Table 3 and
Supplementary Fig. 2). Consequently, de novo mutation calls with
QS ≥ 50 were included in the study; this set is estimated to include
90% of mutations with QS > 0, with ~100% specificity; 90% of these
have the maximum QS of 100 (Supplementary Fig. 3). Sensitivity is
further diminished by ~5% owing to bases with very low read coverage.
We found 0.88 de novo mutations per subject in CHD cases and
0.85 in controls. These mutation rates (1.34 and 1.29 × 10⁻⁸ per
targeted base) are not significantly different (P = 0.63, binomial test) and
are similar to previous estimates. The set of de novo mutations is
shown in Supplementary Table 4. 

CHD cases and controls had very similar maternal and paternal 
ages, which had a small effect on the mutation rate (Supplementary 
Fig. 4). We found no significant effect of geographic ancestry on the 
mutation rate (Supplementary Fig. 5). The number of de novo muta- 
tions per subject closely approximated the Poisson distribution, pro- 
viding no evidence for mutation clustering (Supplementary Fig. 6). 
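
The Poisson comparison can be made concrete by computing how many subjects would be expected to carry 0, 1, 2, ... de novo mutations if mutations arose independently. This sketch uses only the reported mean for CHD cases (0.88 per subject) and the cohort size (362); the observed per-subject counts are in Supplementary Fig. 6 and are not reproduced here, so the code shows the expected side of the comparison only.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return exp(-mean) * mean ** k / factorial(k)

def expected_counts(n_subjects, mean, max_k=3):
    """Expected subjects with 0..max_k mutations, plus a final '> max_k' bin."""
    counts = [n_subjects * poisson_pmf(k, mean) for k in range(max_k + 1)]
    counts.append(n_subjects - sum(counts))  # remaining tail probability
    return counts

chd_expected = expected_counts(362, 0.88)
for k, n in enumerate(chd_expected[:-1]):
    print(f"{k} mutations: {n:.1f} subjects expected")
print(f">3 mutations: {chd_expected[-1]:.1f} subjects expected")
```

A chi-square goodness-of-fit test would then compare these expected counts with the observed ones; marked clustering would show up as an excess of subjects in the upper bins.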

Table 1 | De novo mutations in genes with high expression in developing heart in CHD probands and controls

Mutations in genes in top                 Total no. de novo mutations   De novo mutations/subject    Odds ratio,
quartile of expression                    CHD           Controls        CHD           Controls       cases:controls
at E14.5                                  (362 trios)   (264 trios)     (362 trios)   (264 trios)    (95% CI)†            P value‡
Silent                                    21            21              0.06          0.08           NA                   0.35
Non-conserved missense                    27            17              0.07          0.06           1.59 (0.67-3.74)     0.76
Silent and protein changing               102           53              0.28          0.20           NA                   0.05
All protein changing                      81            32              0.22          0.12           2.53 (1.22-5.25)     0.003
Conserved missense                        39            13              0.11          0.05           3.00 (1.25-7.17)     0.01
Conserved and damaging protein altering   54            15              0.15          0.06           3.60 (1.57-8.28)     0.0005
Damaging                                  15            2               0.04          0.01           7.50 (1.52-36.95)    0.01

† The odds ratio is the ratio of protein-altering to silent variants in cases divided by the corresponding ratio in controls.
‡ P values compare the number of variants in each category between cases and controls using a two-tailed binomial exact test.

CI, confidence interval; NA, not applicable.
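
The odds ratio defined in the table footnote can be recomputed directly from the mutation counts. The sketch below does so and adds a 95% interval from the normal approximation on the log odds ratio (the Woolf method). The source does not state how its intervals were obtained, so that choice is an assumption; it reproduces the printed intervals for the two rows checked here.

```python
from math import exp, log, sqrt

def odds_ratio_ci(case_altering, case_silent, control_altering, control_silent, z=1.96):
    """Odds ratio of protein-altering to silent variants (cases vs controls)
    with a log-normal (Woolf) confidence interval."""
    odds_ratio = (case_altering / case_silent) / (control_altering / control_silent)
    se_log = sqrt(1 / case_altering + 1 / case_silent
                  + 1 / control_altering + 1 / control_silent)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Counts from Table 1: silent variants are 21 in cases and 21 in controls.
all_protein_changing = odds_ratio_ci(81, 21, 32, 21)
damaging = odds_ratio_ci(15, 21, 2, 21)
print("All protein changing: %.2f (%.2f-%.2f)" % all_protein_changing)
print("Damaging: %.2f (%.2f-%.2f)" % damaging)
```

The wide interval for damaging mutations reflects the two control events; the point estimate of 7.5 should be read with that in mind.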


Genes contributing to CHD should be expressed in the developing 
heart/anlagen or tissues that provide developmental cues. We used 
RNA sequencing of mouse heart at embryonic day (E)14.5 (Methods) 
to partition 16,676 genes with identified human-mouse orthologues
into the top quartile of expression (4,169 genes with high heart expres- 
sion, HHE; threshold, >40 reads per million mapped reads (r.p.m.)) 
and the bottom 75% (12,507 with lower heart expression, LHE). The 
HHE set included regulatory genes known to be expressed at this stage 
such as Gata4, Nkx2-5 and Tbx5. 
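
The partition into HHE and LHE sets is a rank-based split on expression. A minimal sketch follows, assuming a mapping from gene name to r.p.m.; the gene names and values are placeholders, and the function name is hypothetical rather than part of the authors' pipeline.

```python
def partition_by_expression(rpm_by_gene, top_fraction=0.25):
    """Split genes into a high-expression set (top fraction by r.p.m.) and the rest."""
    ranked = sorted(rpm_by_gene, key=rpm_by_gene.get, reverse=True)
    n_top = round(len(ranked) * top_fraction)
    return set(ranked[:n_top]), set(ranked[n_top:])

# Placeholder data: 16,676 made-up genes with made-up expression values.
rpm = {f"gene{i}": float(i) for i in range(16_676)}
hhe, lhe = partition_by_expression(rpm)
print(len(hhe), len(lhe))  # 4169 12507
```

With 16,676 genes, the top quartile contains 4,169 genes and the remainder 12,507, matching the set sizes reported in the text.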

We found a significant increase in the rate of protein-altering de 
novo mutations in HHE genes in patients with CHD compared to 
controls (P = 0.003, binomial test; odds ratio = 2.53; Table 1).
Because it is unlikely that all such de novo mutations alter protein 
function, we enriched for deleterious de novo mutations, first remov- 
ing missense mutations at weakly conserved positions among verte- 
brate orthologues (two or more species with substitutions, median 
seven), then removing missense mutations at highly conserved posi- 
tions (zero or one species with substitution, 72% with zero), leaving 
only damaging mutations (premature termination, splice site and fra- 
meshift). This produced successive increases in the odds ratios to 3.60 
and 7.50, with significant differences between cases and controls in 
each group (Table 1 and Fig. 1a). The rise in odds ratio with increasing 
stringency was significant (P= 0.001, logistic model regression). 
Other predictors of deleterious mutations, such as PolyPhen-2, yielded 
similar results (probably deleterious missense mutations plus dam- 
aging mutations; P = 0.0007, binomial test). Similar results were found 
when genes were partitioned across a range of expression thresholds in 
the developing heart (Supplementary Table 5) and also when analyses 
used heart RNA expression from E9.5 (Supplementary Table 6). By 
contrast, there was no significant difference in mutation frequency in 
CHD cases versus controls among LHE genes, with odds ratios near or 
<1 in all comparisons (Fig. 1a and Supplementary Table 7). Analysis 
comparing the presence or absence of de novo mutations in each case 
and control yielded similar results (Supplementary Table 8 and 
Supplementary Fig. 7). Examination of subjects with left ventricular 
obstruction, conotruncal defects and heterotaxy demonstrated simi- 
larly increased odds ratios for each group (Supplementary Table 9). 
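
The case-control comparisons rest on a two-tailed binomial exact test. The sketch below implements one in plain Python. The source does not state the null proportion, so the code assumes that, under the null, each mutation falls in a case with probability equal to the share of case trios (362 of 626); with that assumption the results appear to be in line with the P values in Table 1.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_test_two_sided(k, n, p):
    """Exact two-sided P value: total probability of all outcomes that are
    no more likely than the observed one."""
    observed = binom_pmf(k, n, p)
    total = sum(pmf for pmf in (binom_pmf(i, n, p) for i in range(n + 1))
                if pmf <= observed * (1 + 1e-7))  # tolerance for float ties
    return min(1.0, total)

# Assumed null: a mutation lands in a case with probability 362 / (362 + 264).
p_case = 362 / (362 + 264)
p_protein_changing = binomial_test_two_sided(81, 81 + 32, p_case)
p_silent = binomial_test_two_sided(21, 21 + 21, p_case)
print(f"protein changing: P = {p_protein_changing:.4f}")
print(f"silent: P = {p_silent:.2f}")
```

The silent class serves as an internal check: a non-significant result there indicates that cases and controls were sequenced and called comparably.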

Comparison of de novo mutation frequencies in HHE genes versus 
LHE genes in the CHD cohort also revealed a significantly greater rate 
in HHE genes, again with odds ratios increasing with increasingly 
stringent filters (Fig. 1b and Supplementary Table 7). By contrast, 
controls showed no significant difference in mutation frequencies in 
HHE versus LHE, again with all odds ratios near or <1 (Fig. 1b and 
Supplementary Table 7). 

Notably, examination of genes mutated in the CHD set revealed 
eight involved in the production, removal or reading of methylation of 
H3K4 (H3K4me). Interestingly, three genes in this pathway (MLL2, 
KDM6A, CHD7) have previously been implicated in rare syndromic 
CHD. In Gene Ontology analysis (http://david.abcc.ncifcrf.gov/) of
the 249 protein-altering de novo mutations in CHD probands, the 
H3K4me pathway was the only gene set with significant enrichment 
(P=4 X 10 7, modified Fisher’s exact test, P=4 X 10 * after
Bonferroni correction; see Methods). The number of mutations in this
gene set expected by chance was one and controls showed none. 

H3K4me is an activating mark found in promoters/enhancers of key
developmental genes. Early in development 'poised' promoters/
enhancers have both activating H3K4me marks and inactivating 
H3K27me marks; these promoters/enhancers and their target genes 
are selectively activated by modification of these marks in different 
lineages. Mutations in these genes (Table 2 and Fig. 2) included 27% of 
the damaging mutations in the HHE gene set. Mutated genes included 
MLL2 (frameshift) and WDR5 (missense), components of the MLL2
H3K4 N-methyltransferase complex; KDM5A (missense) and
KDM5B (splice donor), both H3K4 demethylases; and CHD7 (premature
termination), an ATP-dependent helicase that binds H3K4me
sites. There were also de novo mutations in RNF20 (premature
termination) and UBE2B (missense), components of a histone H2BK120
ubiquitination complex, and in USP44 (missense), encoding a histone
H2B deubiquitinase. Ubiquitination at H2BK120 is required for
H3K4 methylation.

Interestingly, SMAD2 is mutated twice (splice site, conserved mis- 
sense), a finding unlikely to occur by chance (P = 0.015, Monte Carlo 
simulation) (Table 2). SMAD2 is asymmetrically phosphorylated down- 
stream of NODAL signalling in the embryonic left-right organizer, 
resulting in SMAD2 binding to chromatin, recruitment of JMJD3 and 
demethylation of H3K27me, enabling transcriptional activation at 
poised sites. Additional genes of note (Table 2) include SUV420H1

Figure 1 | Enrichment of nonsynonymous de novo mutations in heart-expressed
genes. a, Odds ratios, standard errors and P values (two-tailed binomial
exact test) are shown comparing incidence of classes of de novo mutations in CHD 
cases versus controls for genes in top 25% (red bars) and bottom 75% (blue bars) of 
expression at E14.5 in the developing heart. b, Odds ratios for incidence of 
mutations in genes in top 25% versus bottom 75% of expression in CHD cases (red 
bars) and controls (blue bars). Damaging denotes premature termination, 
frameshift or splice site mutations; conserved MS and noncons. MS denote 
mutations at highly or poorly conserved positions, respectively. NS, not significant. 

Table 2 | Genes of interest with de novo mutations in probands

ID        Gene       Mutation             Dx     Other structural/neuro/ht-wt
1-00596   MLL2†      p.Ser1722Arg fs*9    LVO    Y/Y/N
1-00853   WDR5†      p.Lys7Gln            CTD    /Y/N
1-00534   CHD7†      p.Gln1599*           CTD    Y/Y/Y
1-00230   KDM5A†     p.Arg1508Trp         LVO    /N/Y
1-01965   KDM5B†     p.IVS12+1G>A         LVO    /N/Y
1-01907   UBE2B†     p.Arg8Thr            CTD    N/N/N
1-00075   RNF20†     p.Gln83*             HTX    Y/Y/Y
1-01260   USP44†     p.Glu71Asp           LVO    /N/N
1-02020   SMAD2‡     p.IVS6+1G>A          HTX    Y/N/N
1-02621   SMAD2‡     p.Trp244Cys          HTX    Y/NA/N
1-01451   MED20      p.IVS2+2T>C          HTX    /Y/Y
1-01151   SUV420H1   p.Arg143Cys          CTD    N/Y/N
1-00750   HUWE1      p.Arg3219Cys         LVO    /Y/N
1-00577   CUL3       p.Iso145Phe fs*23    LVO    Y/Y/N
1-00116   NUB1       p.Asp310His          CTD    Y/Y/Y
1-01828   DAPK3      p.Pro193Leu          CTD    N/N/NA
1-03151   SUPT5H     p.Glu451Asp          LVO    N/NA/
1-00455   NAA15      p.Lys336Lys fs*6     HTX    Y/Y/N
1-00141   NAA15      p.Ser761*            CTD    N/NA/Y
1-01138   USP34      p.Leu432Pro          LVO    N/NA/
1-00448   NF1        p.IVS6+4 del A       CTD    N/NA/
1-00802   PTCH1      p.Arg831Gln          LVO    N/NA/
1-02458   SOS1       p.Thr266Lys          Other  Y/Y/Y
1-02952   PITX2      p.Ala47Val           LVO    N/NA/
1-01913   RAB10      p.Asn112Ser          Other  N/NA/
1-00638   FBN2       p.Asp2191Asn         CTD    N/NA/
1-00197   BCL9       p.Met1395Lys         LVO    N/NA/
1-02598   LRP2       p.Glu4372Lys         HTX    N/NA/

Gene symbols are as in NCBI RefSeq database. Other structural/neuro/ht-wt denotes presence (Y) or
absence (N) of other structural abnormalities, impaired cognitive, speech or motor development, and
height (ht) and/or weight (wt) less than 5th percentile for age, respectively. Further clinical details in
Supplementary Tables 10 and 11. Associated syndromes: MLL2, Kabuki syndrome; CHD7, CHARGE
syndrome; CUL3, pseudohypoaldosteronism, type 2E.

* Premature termination mutation.
† Gene involved in production, removal or reading of H3K4 methylation mark.
‡ Gene involved in removal of H3K27 methylation mark.

Del, deletion; Dx, diagnosis; fs, frameshift mutation; fs*n, frameshift mutation followed by premature
termination n codons later; NA, data not available.


(missense), encoding a histone H4 methylase; MED20 (splice site), a 
component of the mediator complex; HUWE1 (missense), a ubiquitin 
ligase targeting histones and TP53; CUL3 (frameshift), a scaffold for 
assembly of many RING ubiquitin ligases; and NUB1 (missense), which
inhibits NEDD8, a cofactor for cullin-based ubiquitin ligases. Last,
NAA15, an N-acetyltransferase, had two damaging mutations, unlikely
to be a chance event (P = 0.01, Monte Carlo simulation). Among the 17 above
genes, ten have no damaging variants and seven have one to five among 
>9,500 exomes in National Heart, Lung, and Blood Exome Sequencing 
Project, 1000 Genomes and Yale exome databases. 

Phenotypes of the eight patients with de novo mutations in the 
H3K4me pathway revealed diverse cardiac phenotypes (Table 2 and 
Supplementary Table 10). Other structural, neurodevelopmental and 
growth abnormalities were common. In addition, consistent with a 
role in left-right axis determination, both patients with SMAD2
mutations had dextrocardia with unbalanced complete atrioventricu- 
lar canal and pulmonary stenosis. For other genes mutated more than 
once (for example, NAA15), probands had dissimilar cardiac pheno- 
types (Supplementary Table 11). 

Before initiating exome sequencing, we defined a set of 277 candid- 
ate CHD genes (Supplementary Table 12) from human and model 
system studies. There were 13 CHD probands with de novo mutations 
in these genes (Table 2 and Supplementary Table 13), more than 
expected by chance (P=7 X 10 *, Monte Carlo simulation) or in 
controls (n = 1; P= 0.006, binomial test). This set included several 
genes known to cause Mendelian CHD; however, affected subjects 
lacked cardinal disease manifestations or had atypical cardiac features. 
For example, the patient with the CHD7 mutation had none of the 
main criteria (coloboma, choanal atresia or hypoplastic semicircular 
canals) for CHARGE syndrome. Similarly, the patient with the MLL2
mutation was not prospectively diagnosed with Kabuki syndrome; 
however, re-evaluation at age 2 after sequencing identified characteristic
facial features. Additionally, a patient with an NF1 mutation had a

Figure 2 | De novo mutations in the H3K4 and H3K27 methylation
pathways. Nucleosome with histone octamer and DNA, with H3K4 
methylation bound by CHD7, H3K27 methylation and H2BK120 
ubiquitination is shown. Genes mutated in CHD that affect the production, 
removal and reading of these histone modifications are shown; genes with 
damaging mutations are shown in red, those with missense mutations are shown 
in blue. SMAD2 (2) indicates there are two patients with a mutation in this gene. 
Genes whose products are found together in a complex are enclosed in a box. 


complex conotruncal defect, an unusual finding in neurofibromatosis. 
These findings support variable expressivity and a broader phenotypic 
spectrum resulting from mutations at known disease loci. Other genes 
of interest in this set included RAB10 and BCL9, identified as candidates
by rare de novo copy-number variants.

Our results implicate de novo point/insertion-deletion (indel) 
mutations that by chance occur in genes required for normal heart 
development in the pathogenesis of diverse CHDs. Consistent with 
this inference, genes with damaging and conserved missense muta- 
tions in CHD probands showed higher expression in E14.5 mouse 
heart compared to controls (Supplementary Fig. 8; median 45 versus 
16 r.p.m.; P=5 X 10 *, Wilcoxon signed-rank test), whereas expression
of genes with silent mutations showed no significant difference
(median 21 versus 19 r.p.m.; P= 0.7, Wilcoxon signed-rank test). 
Expression at E9.5 shows similar results (Supplementary Fig. 8). The 
increased mutation burden of HHE genes in cases is not due to a higher 
intrinsic mutation rate of these genes because the rate is significantly 
higher than in controls; moreover, there is no significant difference in 
mutation rate between HHE and LHE genes in controls. Further, par- 
titioning genes into analogous high- and low-expression groups for 
four control adult tissues (brain, heart, liver and lung) showed no 
significant differences in mutation burden between cases and controls 
or between high- and low-expression groups (Supplementary Fig. 9). 

From the increased fraction of patients with protein-altering muta- 
tions in HHE genes in CHD patients (0.22) versus controls (0.12), we 
estimate that such mutations have a role in about 10% of these patients 
(95% confidence interval, 5-15%). This could be somewhat underesti- 
mated, as mutation detection is incomplete, analysis is limited to genes 
with identified mouse orthologues, and the HHE set may not include all 
trait loci. Similarly, the observed odds ratios may be somewhat under- 
estimated as not all mutations in cases are likely to confer risk. 

These findings establish that mutations in many genes in the 
H3K4me-H3K27me pathway disrupt cardiac development and are 
consistent with previous evidence implicating these chromatin marks 
in regulating key developmental genes, including those involved in
cardiac development. Targeted sequencing in larger CHD cohorts
will enable assessment of the role of each individual gene in this path- 
way. These findings imply dosage sensitivity for these chromatin 
marks in CHD, similar to recent findings implicating haploin- 
sufficiency for chromatin modifying/remodelling genes in diverse 
cancers. Investigation of the consequences of these mutations on
specific enhancers/promoters and the genes they regulate will probably
provide further insight into CHD pathogenesis.

The demonstration that point/indel mutations contribute to ~10% 
of CHD patients and the finding that six genes were mutated twice 
(Supplementary Table 11) enables an estimate of the size of the gene set 
that contributes to these CHDs (see Methods). The point-wise estim- 
ate is 401 genes (95% confidence interval, 197-813), indicating that 
many more CHD-related genes and pathways remain to be discovered. 

Exome sequencing of probands with autism has revealed broadly
similar results: de novo mutations in a large set of genes occur in a
significant fraction of patients, with relatively high odds ratios for
damaging mutations in genes expressed in the brain. Most interestingly,
CHD8, which like CHD7 reads H3K4me marks, is frequently
mutated in autism, raising the question of whether the H3K4me
pathway may have a role in many congenital diseases. Among 249
protein-altering de novo mutations in CHD (Supplementary Table 4)
and 570 such mutations in autism, there were two genes, CUL3
and NCKAP1, with damaging mutations in both CHD and autism and
none in controls (P = 0.001, Monte Carlo simulation), and several
others with mutations in both (for example, SUV420H1 and CHD7).
Similarly, rare copy-number variants at 22q11.2, 1q21 and 16p11 are
found in patients with autism, CHD or both diseases. These
observations suggest variable expressivity of mutations in key developmental
genes. Identification of the complete set of these developmental
genes and the full spectrum of the resulting phenotypes will
likely be important for patient care and genetic counselling.

Our findings do not resolve the pathogenesis of most CHD cases. 
Rare and de novo copy-number variants seem to account for a small 
fraction; rare or common transmitted variants are also expected to
make significant contributions. Additionally, considering the role of 
H3K4me and H3K27me marks in promoter/enhancer regulation, 
non-coding mutations cannot be dismissed. Last, evidence of dosage 
sensitivity of many chromatin-modifying genes raises the possibility 
that environmental perturbations of these pathways in critical deve- 
lopmental windows might phenocopy the effects of these mutations. 


METHODS SUMMARY 


De novo mutations in a cohort of 362 probands with CHD and 264 unaffected 
subjects were identified by exome sequencing of parent-offspring trios. Gene 
expression in mouse heart at E14.5 was quantitated by RNA sequencing, and 
genes in the top quartile of expression were identified. The frequency of de novo 
mutations in genes with higher expression in developing heart was compared in 
CHD cases and controls. Enrichment of mutations in particular pathways was 
examined using Gene Ontology. 


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS


Patient cohorts. Probands with or without parents were recruited from nine 
centres in the United States and the United Kingdom into the Congenital Heart 
Disease Genetic Network Study of the Paediatric Cardiac Genomics Consortium 
(CHD genes: ClinicalTrials.gov identifier NCT01196182). The protocol was
approved by the Institutional Review Boards of Boston Children’s Hospital, 
Brigham and Women’s Hospital, Great Ormond Street Hospital, Children’s 
Hospital of Los Angeles, Children’s Hospital of Philadelphia, Columbia 
University Medical Center, Icahn School of Medicine at Mount Sinai, Rochester 
School of Medicine and Dentistry, Steven and Alexandra Cohen Children’s 
Medical Center of New York, and Yale School of Medicine. Written informed 
consent was obtained from each participating subject or their parent/guardian. 
Probands were selected for severe CHD (excluding isolated ventricular septal 
defects, atrial septal defects, patent ductus arteriosus or pulmonic stenosis), avail- 
ability of both parents and absence of any CHD in first-degree relatives. Cardiac 
diagnoses were obtained from review of echocardiogram, catheterization and 
operative reports; extracardiac findings were extracted from medical records. 
Controls were from 264 previously studied quartets that included one offspring 
with autism, an unaffected sibling and unaffected parents, all recruited with
written informed consent by the Simons Foundation Autism Research Initiative.
Parents and their unaffected sibling from this cohort were analysed in the current 
study. 

Exome sequencing. Trios were sequenced at the Yale Center for Genome Analysis 
following the same protocol. Genomic DNA from venous blood was captured with 
the NimbleGen v2.0 exome capture reagent (Roche) and sequenced (Illumina 
HiSeq 2000, 75-base paired-end reads). Reads were mapped to the reference
genome using ELANDv2. Single-nucleotide variants and indel calls were assigned
a QS using SAMtools and annotated for novelty using dbSNP, build 135, 1000
Genomes (May 2011 release) and the Yale Exome Database, for impact on 
encoded proteins and conservation of variant position. 

Identification and confirmation of de novo mutations. Heterozygous single 
nucleotide variants and indels in the proband that showed SAMtools QS ≥ 60 and
≥ 600, respectively, and rare non-reference calls in both parents were selected. Read
plots of all putative indels were visually inspected in trio members to eliminate 
false calls. A Bayesian algorithm was used to assist de novo mutation calls. 
Elements included probability of the proband being heterozygous at the test posi- 
tion; probability that parents are homozygous for the reference allele, given fre- 
quency of reference and non-reference reads and probability of heterozygosity in 
offspring; probability that a variant is de novo given its population frequency. 
Resulting Bayesian QSs were scaled from 0 to 100. Their correlation with bona 
fide de novo mutations was determined by Sanger sequencing of PCR amplicons 
harbouring 181 putative mutations distributed across the Bayesian QS spectrum. 
Additionally, all six de novo indels with Bayesian QS > 50 in the HHE gene set 
were tested and confirmed by Sanger sequencing. 

RNA sequencing and analysis. Hearts from E14.5 mouse embryos (strain 129/ 
SvEv) were isolated, rinsed and immersed in RNALater. Left and right atria, left 
ventricle (with interventricular septum, aortic and mitral valves) and right vent- 
ricle (with pulmonary and tricuspid valves) were dissected. Chamber-specific 
RNAs were extracted and pooled from five embryos, selected with oligo-dT, 
copied into double-stranded DNA and ligated to adaptors. 150-250 base-pair 
fragments were isolated after acrylamide gel electrophoresis, amplified and 
sequenced (Illumina HiSeq 2000), with >40 million paired-end 50-base reads 
per library as previously described. Reads were aligned to the mouse genome
(mm9) and r.p.m. was determined. The average r.p.m. of each gene from each
chamber was used as the measure of heart expression. RNA from atria, ventricle 
and truncus/outflow tract at E9.5 was prepared, sequenced and analysed by an 
analogous approach. RNA sequencing of control human adult tissues (lung, liver,
heart and brain) from the Illumina Human Body Map
(http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-513/?query=illumina+human+body+map)
was similarly performed and analysed as r.p.m. per kilobase of transcript.
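
The expression measure described above is a per-library normalization followed by an average across heart chambers. The sketch below is illustrative only: the read counts are hypothetical, the function names are not from the authors' pipeline, and the sum of per-gene counts stands in for the total number of mapped reads.

```python
def reads_per_million(read_counts):
    """Scale raw per-gene read counts to reads per million (r.p.m.)."""
    total = sum(read_counts.values())  # stand-in for total mapped reads
    return {gene: count * 1_000_000 / total for gene, count in read_counts.items()}

def average_across_chambers(chamber_rpm):
    """Mean r.p.m. of each gene over a list of per-chamber r.p.m. mappings."""
    genes = chamber_rpm[0].keys()
    return {g: sum(ch[g] for ch in chamber_rpm) / len(chamber_rpm) for g in genes}

# Hypothetical counts for two chambers (libraries of different depth).
left_ventricle = {"Gata4": 60, "Tbx5": 40, "other": 999_900}
right_ventricle = {"Gata4": 40, "Tbx5": 20, "other": 1_999_940}
heart_rpm = average_across_chambers(
    [reads_per_million(left_ventricle), reads_per_million(right_ventricle)]
)
print(heart_rpm["Gata4"], heart_rpm["Tbx5"])
```

Normalizing each library before averaging keeps a deeply sequenced chamber from dominating the combined value.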


Principal component analysis. The EIGENSTRAT program was used to compare 
single-nucleotide polymorphism (SNP) genotypes of probands and individuals
of known ancestry in HapMap3 (http://hapmap.ncbi.nlm.nih.gov/). SNPs with 
minor allele frequency (MAF) >5% without significant linkage disequilibrium 
with other SNPs were analysed. The results of analysis correctly distinguished 
ancestry groups in HapMap3 samples; ancestries of CHD subjects were assigned 
accordingly. 

Statistical analyses. The significance of mutation frequency differences between 
groups was tested with two-tailed binomial exact tests; two-tailed Fisher’s exact 
tests assessed differences in numbers of patients with one or more de novo
mutations; comparisons among three groups used chi-square analysis. Gene expression at
E14.5 of genes mutated in cases and controls was compared by Wilcoxon signed- 
rank test. Correlation of mutation rate and parental age was tested by Pearson’s 
correlation. The expected number of genes with more than one de novo mutation 
was determined by Monte Carlo simulation (10° iterations) specifying the total 
number of protein-altering mutations and 21,000 genes of observed coding length. 
Analogous approaches were used to determine probabilities of any gene having 
≥ 2 damaging mutations, ≥ 1 damaging and ≥ 1 mutation at a conserved position,
and ≥ 13 genes mutated in both CHD and autism. The fit to the Poisson
distribution of the observed numbers of de novo mutations per subject was assessed by
Chi-square test. 
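
The Monte Carlo procedure for recurrently mutated genes can be sketched as follows: scatter the observed number of mutations over genes with probability proportional to coding length, and count how many genes are hit at least twice. The gene lengths below are random placeholders (real values would come from a gene annotation), the iteration count is kept small for speed, and the function names are hypothetical.

```python
import random
from collections import Counter
from itertools import accumulate

def simulate_recurrent_genes(gene_lengths, n_mutations, n_iter, seed=0):
    """For each iteration, place n_mutations on genes with probability
    proportional to coding length; record the number of genes hit >= 2 times."""
    rng = random.Random(seed)
    cum_weights = list(accumulate(gene_lengths))
    gene_ids = range(len(gene_lengths))
    recurrent_counts = []
    for _ in range(n_iter):
        hits = Counter(rng.choices(gene_ids, cum_weights=cum_weights, k=n_mutations))
        recurrent_counts.append(sum(1 for n in hits.values() if n >= 2))
    return recurrent_counts

def empirical_p(simulated, observed):
    """Fraction of simulations with at least `observed` recurrently hit genes."""
    return sum(1 for s in simulated if s >= observed) / len(simulated)

# Placeholder coding lengths (bp) for 21,000 genes; not real annotation data.
length_rng = random.Random(1)
lengths = [length_rng.randint(300, 5000) for _ in range(21_000)]
sims = simulate_recurrent_genes(lengths, n_mutations=249, n_iter=200)
print("mean genes hit twice or more:", sum(sims) / len(sims))
print("empirical P for six or more:", empirical_p(sims, observed=6))
```

Weighting by coding length matters because long genes are hit twice by chance more often than short ones; a uniform draw would understate the chance expectation.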

Overrepresentation of de novo mutations in the H3K4me pathway and the 
presence of significant enrichment of other gene pathways was tested by Gene 
Ontology analysis, using a modified Fisher’s exact test with Bonferroni correction 
as implemented in DAVID (http://david.abcc.ncifcrf.gov/). Input was all genes 
with protein-altering de novo mutations in CHD or control subjects, and all genes 
sequenced. The H3K4me gene set was: CHD8, MLL3, SETD7, WHSC1L1, CDC73,
WHSC1, SETD1A, MLL2, KDM5A, MLL4, MLL5, UBE2B, ASH1L, SETD1B, MLL,
LEO1, PAF1, KDM5C, CTR9, PRDM9, MEN1, CHD7, RNF20, KDM1A, RNF40,
SMYD3, KDM6A, KDM5B, USP44 and WDR5. The expected number of
mutations in the H3K4me set was calculated from the fraction of the exome-coding
region attributable to this gene set and the total number of de novo mutations. 

Estimating number of genes in which de novo mutations contribute to CHD.
We addressed this question using the 'unseen species problem'. We infer that the
number of probands with nonsynonymous mutations in the HHE set (81) minus 
the expected number (44; calculated from the number observed in controls) 
represents the number of subjects in whom de novo mutations confer CHD risk 
(37; 10.0% of probands). The number of genes with >1 protein-altering de novo 
mutation (six) minus the most likely number expected by chance (three) repre- 
sents risk-associated genes with more than one mutation (three). The number of 
risk-associated genes (C) is estimated as follows: 


$$C = \frac{c}{u} + g^{2}\, d\, \frac{1 - u}{u}$$


where c = number of observed risk-associated genes (34), c_1 = number of genes
mutated once (31), d = total number of risk-associated mutations (37), g = variation
in effect size of individual de novo mutations (assumed to be 1, which
minimizes underestimation of set size), and u = 1 - c_1/d (probability that a newly added
mutation hits a previously mutated gene).


C=401. 


From 95% confidence intervals of the number of risk-associated events, the 95% 
confidence interval for number of risk genes is calculated as 197-837. 
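
The point estimate can be checked numerically. The sketch below reads the estimator as C = c/u + g²·d·(1 − u)/u with u = 1 − c_1/d, an interpretation of the printed formula that reproduces the reported value of 401 from the stated inputs; the function name is hypothetical.

```python
def estimate_risk_genes(c, c1, d, g=1.0):
    """Unseen-species style estimate of the total number of risk genes.

    c  : observed risk-associated genes
    c1 : genes mutated exactly once
    d  : total risk-associated mutations
    g  : variation in effect size (assumed to be 1 in the text)
    """
    u = 1 - c1 / d  # chance that a new mutation hits an already-mutated gene
    return c / u + g ** 2 * d * (1 - u) / u

C = estimate_risk_genes(c=34, c1=31, d=37)
print(round(C))  # 401
```

Because u is small (6/37), the estimate is sensitive to the handful of genes seen more than once, which is consistent with the wide confidence interval reported.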

RIP1-driven autoinflammation targets IL-1α
independently of inflammasomes and RIP3

The protein-tyrosine phosphatase SHP-1 has critical roles in
immune signalling, but how mutations in SHP-1 cause inflammatory
disease in humans remains poorly defined. Mice homozygous
for the Tyr208Asn amino acid substitution in the carboxy
terminus of SHP-1 (referred to as Ptpn6^spin mice) spontaneously
develop a severe inflammatory syndrome that resembles neutrophilic
dermatosis in humans and is characterized by persistent
footpad swelling and suppurative inflammation. Here we report
that receptor-interacting protein 1 (RIP1)-regulated interleukin
(IL)-1α production by haematopoietic cells critically mediates
chronic inflammatory disease in Ptpn6^spin mice, whereas inflammasome
signalling and IL-1β-mediated events are dispensable. IL-1α
was also crucial for exacerbated inflammatory responses and unremitting
tissue damage upon footpad microabrasion of Ptpn6^spin
mice. Notably, pharmacological and genetic blockade of the kinase
RIP1 protected against wound-induced inflammation and tissue
damage in Ptpn6^spin mice, whereas RIP3 deletion failed to do so.
Moreover, RIP1-mediated inflammatory cytokine production was
attenuated by NF-κB and ERK inhibition. Together, our results
indicate that wound-induced tissue damage and chronic inflammation
in Ptpn6^spin mice are critically dependent on RIP1-mediated
IL-1α production, whereas inflammasome signalling and RIP3-mediated
necroptosis are dispensable. Thus, we have unravelled a
novel inflammatory circuit in which RIP1-mediated IL-1α secretion
in response to deregulated SHP-1 activity triggers an inflammatory
destructive disease that proceeds independently of inflammasomes
and programmed necrosis.

Mutations in the non-receptor protein tyrosine phosphatase Src 
homology region 2 (SH2) domain-containing phosphatase-1 (SHP-1) 
are associated with a spectrum of inflammatory and autoimmune 
diseases in humans. Similarly, motheaten null and hypomorphic 
alleles of Ptpn6, the gene encoding SHP-1, cause a myeloproliferative 
disease in mice that is characterized by chronic inflammation. 
Despite being one of the first in vivo genetic models of inflammatory 
disease, the prevailing mechanism responsible for SHP-1-driven 
inflammation remains to be formally elucidated. Complete characterization 
of the molecular mechanism responsible for SHP-1-mediated 
disease has been hindered by the fact that motheaten mice are 
immunodeficient, develop devastating pneumonitis and glomerulonephritis, 
and die by 2-9 weeks of age. Recently, a new SHP-1 mutant mouse 
line (referred to as Ptpn6^spin mice) that harbours a Tyr208Asn amino 
acid substitution in the C-terminal SH2 domain of SHP-1 was 
described. Mice homozygous for the hypomorphic spin allele develop 
a chronic inflammatory and autoimmune disease at 8-16 weeks of age 
that presents as persistent swelling and suppurative inflammation of 
the cutaneous footpad tissue (Fig. 1a). The popliteal lymph nodes that 
drain the inflamed feet display massive lymphomegaly and are composed 
of enhanced numbers of both lymphocytes and myeloid cells 
(Fig. 1b). In contrast, lymph nodes that drain non-inflamed areas in 
Ptpn6^spin mice do not display lymphomegaly (Supplementary Fig. 1). 
Diseased Ptpn6^spin mice show elevated levels of circulating cytokines 
and chemokines that are associated with granulopoiesis and neutrophil 
recruitment (Fig. 1c). Consistent with augmented production of 
granulopoietic factors, the inflammatory lesions in the footpads of 
mutant mice are dominated by neutrophils (Fig. 1d), and neutrophilia 
ensues in the periphery (Fig. 1e and Supplementary Fig. 2). Moreover, 
spontaneous disease progression in Ptpn6^spin mice is characterized by 
Figure 1 | Ptpn6^spin mice develop spontaneous footpad inflammation. 
a, b, Spontaneous induction of footpad swelling (a) and lymphomegaly (b) in 
the popliteal lymph nodes (popLN) of Ptpn6^spin mice at 10-16 weeks of age. 
b-f, Wild-type (WT) and diseased Ptpn6^spin mice were harvested at 10-12 weeks 
of age. b, Numbers (mean ± s.e.m.) of popliteal lymph node cells. 
Inset shows representative pictures of popliteal lymph nodes (original 
magnification, ×1). c, Serum levels of cytokines and chemokines. 
d, Immunohistochemistry staining of neutrophils in the footpads (original 
magnification, ×20). e, Frequency of splenocytes that are neutrophils. 
f, Production of IL-17 and IFN-γ by popliteal lymph node CD4+ and CD8+ 
T cells following in vitro re-stimulation. Each point represents an individual 
mouse, and the line represents the mean ± s.e.m. **P < 0.01, ***P < 0.001. 
DCs, dendritic cells; NK, natural killer. 
enhanced frequencies of inflammatory T cells that produce high levels 
of IL-17 and IFN-γ (Fig. 1f and Supplementary Fig. 3) and the accumulation 
of T cells that exhibit an effector/memory phenotype 
(CD44^hi CD62L^lo) (Supplementary Fig. 4). Analysis of mice before 
the onset of overt disease (4-8 weeks of age) reveals that Ptpn6^spin mice 
possess normal numbers of lymphoid and myeloid cells (Supplementary 
Fig. 5), and do not display perturbations in T-cell development, 
regulatory T-cell numbers, or T-cell activation status before disease 
progression (Supplementary Fig. 6). Furthermore, the Ptpn6^spin mutation 
does not affect inflammatory cytokine production by peripheral T cells 
and other immune cells in young mutant mice (Supplementary Fig. 7). 

Previous work established that IL-1 receptor (IL-1R) signalling is 
required for Ptpn6^spin-mediated inflammatory disease. However, the 
molecular mechanisms operating upstream of IL-1R engagement that 
are responsible for spontaneous induction of inflammatory disease are 
not known. Inflammasome-driven activation of caspase 1 is increasingly 
recognized as a central instigator of inflammation and disease 
pathology through its critical role in the production of bioactive 
IL-1β. In this context, the NLRP3 inflammasome responds to a multitude 
of damage-associated danger signals that are associated with 
autoinflammation. To test whether aberrant inflammasome activation 
is responsible for inducing inflammatory disease in response to 
defective SHP-1 signalling, Ptpn6^spin mice were bred to animals that 
are deficient in the key inflammasome proteins NLRP3 and caspase 1. 
However, homozygous disruption of neither NLRP3 nor caspase 1 
rescued Ptpn6^spin mice from footpad inflammation (Fig. 2a and 
Supplementary Fig. 8) or neutrophil infiltration (Fig. 2b). In full 
agreement, homozygous deletion of the gene encoding IL-1β also 
failed to prevent footpad inflammation and granulocyte recruitment 
in Ptpn6^spin mice (Fig. 2a, b and Supplementary Fig. 8), nor were 
excessive inflammatory cytokine production and cutaneous inflammatory 
disease rescued by genetic deletion of Tlr4 (Supplementary 
Fig. 9). In marked contrast, genetic ablation of Il1a provided significant 


[Figure 2 image panels: footpad and flow cytometry data for WT, Ptpn6^spin, and Ptpn6^spin mice crossed with Nlrp3^−/−, Casp1^−/−, Il1b^−/− or Il1a^−/− mice; flow plots include a CD11b axis.] 


Figure 2 | Deletion of IL-1α limits Ptpn6^spin-mediated disease. a, b, Footpad 
images (a) and neutrophil immunohistochemistry staining (b) of wild-type and 
Ptpn6^spin mice that were crossed with mice that are deficient in either NLRP3, 
caspase 1, IL-1β, or IL-1α (original magnification, ×20). c-e, Spleen and 
popliteal lymph nodes from 12-16-week-old wild-type, Ptpn6^spin and 
Ptpn6^spin × Il1a^−/− mice. c, Frequencies of neutrophils in the spleen (top panel) 
and popliteal lymph nodes (bottom panel). Numbers in the FACS plots denote 
the mean frequencies ± s.e.m. of cells that are neutrophils. d, Total numbers of 
neutrophils in the spleen (top panel) and popliteal lymph nodes (bottom 
panel). e, Production of IL-17 by CD4+ T cells following in vitro re-stimulation. 
Data show mean ± s.e.m. Each point represents an individual mouse, and the 
line represents the mean ± s.e.m. **P < 0.01, ***P < 0.001. 
protection from the development of footpad inflammatory disease in 
Ptpn6 mutant mice (Fig. 2a, b and Supplementary Fig. 8), which was 
associated with a return to normal neutrophil numbers (Fig. 2b-d), 
and reduced generation of IL-17-producing CD4+ helper T (TH17) 
cells (Fig. 2e). These findings demonstrate that IL-1α has a central role 
in SHP-1-mediated disease progression, which proceeds independently 
of inflammasome activation and IL-1β secretion. 

Given that IL-1α acts as an alarmin that orchestrates wound-healing 
responses, we next tested whether defective wound healing might 
contribute to disease pathogenesis in Ptpn6^spin mice. To this end, mice 
were subjected to microabrasion injury on the plantar surfaces of the 
hind feet, and monitored for incidence of inflammatory responses. 
Microabrasion-provoked tissue damage induced similar erythema 
and oedema in wild-type and Ptpn6^spin mice during the first 48 h. 
However, inflammation at the wound site was fully resolved in wild-type 
mice by day 14, whereas Ptpn6^spin mice developed exacerbated 
inflammation that was characterized by intense redness and swelling 
of the affected area (Fig. 3a, b and Supplementary Fig. 10). At day 21, 
the inability of Ptpn6^spin mice to curtail wound inflammation ultimately 
resulted in the development of a persistent and aggravated state 
of footpad inflammation characterized by severe pustular dermatosis 
and oedema (Fig. 3a, b and Supplementary Fig. 10). Notably, genetic 
ablation of IL-1α production in Ptpn6^spin mice provided full protection 
from microabrasion-induced footpad inflammation (Fig. 3a, b and 
Supplementary Fig. 10). The microabrasion procedure triggered a 
rapid (4-5 h after wound induction) and potent production of inflammatory 
cytokines and chemokines in wild-type mice that was further 
Figure 3 | Exacerbated wound-healing responses contribute to disease in 
Ptpn6^spin mice. a-c, Microabrasion injuries were induced on the plantar 
surfaces of the footpads of wild-type, Ptpn6^spin and Ptpn6^spin × Il1a^−/− mice. 
a, Clinical scores based on erythema and oedema as described in detail in the 
Methods section were recorded daily. b, Percentage of disease-free mice over 
time. c, Serum levels of granulopoiesis-associated factors 5 h after 
microabrasion stimulation. d, e, Wild-type (n = 4) and disease-free Ptpn6 
mutant mice (5-7 weeks old) (n = 5) were immunized with MOG/CFA and 
pertussis toxin. d, Mean clinical paralysis scores. e, Splenocytes were collected 
on day 20 and re-stimulated with MOG peptide for 48 h to measure cytokine 
secretion. f, Wild-type (n = 7), disease-free (4-7 weeks of age) Ptpn6^spin (n = 9) 
and Ptpn6^spin × Il1a^−/− (n = 5) mice received 250 mg kg^−1 of acetaminophen 
by intraperitoneal injection. The levels of serum alanine aminotransferase 
(sALT) were measured 18-20 h later by ELISA. All bar graphs show 
mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001. 
exacerbated in Ptpn6^spin mice (Fig. 3c). Notably, the enhanced secretion 
of neutrophilic factors in Ptpn6^spin mice was fully rescued in 
Ptpn6^spin mice lacking IL-1α (Fig. 3c). Augmented wound-healing 
responses in Ptpn6^spin mice were not the result of global aberrations 
in inflammation because young Ptpn6^spin mice did not display 
abnormalities in immune cell composition or inflammatory cytokine 
production (Supplementary Figs 5-7). Furthermore, the Ptpn6^spin 
mutation did not affect the generation of MOG-specific T cells or 
neuroinflammation during experimental autoimmune encephalomyelitis 
(EAE) in young mice (Fig. 3d, e and Supplementary Fig. 11). 
Commensal bacteria are increasingly recognized for their role in the 
pathogenesis of autoimmune diseases, and defects in innate immune 
signalling were recently shown to alter the intestinal microbiome. 
Because inflammatory disease was suppressed when Ptpn6^spin mice 
were derived under germ-free conditions, we explored the possibility 
of footpad-associated dysbiosis in Ptpn6^spin mice. However, total 
bacterial counts and composition of the footpad-associated microbiome 
in microabrasion-induced inflammatory skin lesions were comparable 
in separately housed wild-type and Ptpn6^spin mice 
(Supplementary Fig. 12). Furthermore, we failed to observe enhanced 
microabrasion-induced granulopoietic cytokine production in wild-type 
mice that were co-housed with Ptpn6^spin mice (Supplementary 
Fig. 13), indicating that the Ptpn6^spin mutation alters immune responses 
to normal commensal bacteria rather than modifying the bacterial 
ecology of inflammatory skin lesions. Notably, Ptpn6^spin mice 
were also hypersensitive in the acetaminophen (APAP)-induced liver 
injury model, which is considered a model for sterile autoinflammation 
and wound-healing responses. IL-1α deletion provided significant 
protection from APAP-induced liver injury, as evidenced by markedly 
reduced serum alanine aminotransferase levels in APAP-challenged 
Ptpn6^spin × Il1a^−/− mice (Fig. 3f). Together, these results indicate a 
critical role for IL-1α in both sterile and commensal-associated inflammatory 
and wound-healing responses of Ptpn6^spin mice. 

To determine whether SHP-1 regulates inflammatory responses in 
haematopoietic or radioresistant cells, bone marrow chimaera mice 
were generated. Expression of the hypomorphic Ptpn6^spin allele in the 
haematopoietic compartment alone promoted spontaneous footpad 
inflammation (Fig. 4a) concomitant with augmented cytokine production 
(Fig. 4b) and neutrophilia (Supplementary Fig. 14). In contrast, 
chimaeric mice bearing the Ptpn6^spin mutation only in radioresistant 
cells failed to develop footpad inflammation (data not shown), suggesting 
that SHP-1 expression in bone-marrow-derived immune cells 
rather than in non-haematopoietic cells (such as keratinocytes) is critical 
for induction of the autoinflammatory syndrome. Collectively, 
these findings suggest that unwarranted IL-1α release in response to 
dysregulated SHP-1 activity in haematopoietic cells has a pivotal role in 
the induction of inflammatory disease. To identify the bone-marrow-derived 
cell populations that are responsible for Ptpn6^spin-induced 
inflammation, we investigated inflammatory responses in isolated 
macrophages and neutrophils as these cell types have been shown to 
centrally regulate inflammatory and wound-healing responses. The 
Ptpn6^spin mutation did not influence inflammatory cytokine production 
in macrophages (Supplementary Fig. 15). Although Ptpn6^spin 
neutrophils produced slightly higher levels of the proinflammatory 
cytokines granulocyte colony-stimulating factor (G-CSF) and tumour-necrosis 
factor-α (TNF-α) in response to lipopolysaccharide (LPS) 
stimulation, production of other pro-inflammatory mediators (KC, 
IL-6 and IL-1) was normal in these cells (Supplementary Fig. 16). 
We therefore concluded that modest differences in neutrophil-associated 
cytokine production may contribute to, but are unlikely to 
account fully for, the marked inflammatory phenotype observed in vivo. 

The kinase RIP1 is emerging as a key regulator of inflammatory 
cytokine production and cellular stress. To address the in vivo role 
of RIP1 in exacerbated inflammatory cytokine production, Ptpn6^spin 
mice were pre-treated with either vehicle control (PBS), the RIP1 
kinase inhibitor necrostatin 1 (Nec1) or the structurally related inactive 
Nec1 analogue (iNec) before being subjected to microabrasion injury. 
Unlike iNec, Nec1-mediated in vivo inhibition of RIP1 kinase activity 
markedly attenuated secretion of inflammatory mediators in Ptpn6^spin 
mice to levels comparable to those of wild-type mice (Fig. 4c), suggesting 
a critical role for RIP1 signalling in Ptpn6^spin-induced inflammatory 
disease. RIP1-deficient mice suffer from perinatal lethality, 
hampering genetic analysis of the role of RIP1 in Ptpn6^spin-induced 
autoinflammation. However, the observation that Ptpn6^spin-mediated 
autoinflammation stems from the haematopoietic compartment 
(Fig. 4a, b) provided a rationale to explore the role of RIP1 by means 
of fetal liver transplantation experiments. To this end, fetal liver cells 
Figure 4 | RIP1 regulates Ptpn6^spin-mediated disease through the control of 
proinflammatory signalling and Il1a expression and not via RIP3-induced 
necroptosis. a, Spontaneous incidence of footpad inflammation in bone 
marrow chimaeric mice (donor > recipient). b, Levels of circulating 
granulopoiesis-associated factors in bone marrow chimaeras. c, Wild-type mice 
were pre-treated with vehicle control (n = 27) and Ptpn6^spin mice were 
pre-treated with vehicle control (n = 22), 50 µg necrostatin 1 (Nec1) (n = 33), or 
50 µg of an inactive control analogue of Nec1 (iNec) (n = 10) for 1 h before 
microabrasion injury induction. Serum levels of granulopoiesis-inducing 
factors 4 h after wound induction. d, e, Spontaneous incidence of footpad 
inflammation (d) and numbers of peripheral blood neutrophils (e) in wild-type 
(Ptpn6^+/+ × Rip1^+/+ > WT), Ptpn6^spin × Rip1^+/+ (Ptpn6^spin × Rip1^+/+ > WT) 
and Ptpn6^spin × Rip1^−/− (Ptpn6^spin × Rip1^−/− > WT) fetal liver transplant mice. 
f, g, Wild-type, Ptpn6^spin and Ptpn6^spin × Il1a^−/− mice were pre-treated with 
PBS or 50 µg Nec1 1 h before microabrasion injury induction. Regulation of 
ERK and NF-κB signalling (f) and Il1a expression (g) in the footpads 2 h after 
wound induction. h, Microabrasion injuries were induced on the plantar 
surfaces of the footpads of wild-type, Ptpn6^spin and Ptpn6^spin × Rip3^−/− mice. 
Percentages of disease-free mice over time. Data show mean ± s.e.m. of a 
representative experiment. *P < 0.05, **P < 0.01, ***P < 0.001. 
from Ptpn6^+/+ Rip1^+/+, Ptpn6^spin Rip1^+/+ and Ptpn6^spin Rip1^−/− 
embryos collected at embryonic (E) day E14.5 were transferred into 
irradiated CD45.1 congenic mice. Reconstitution of recipient mice with 
control Ptpn6^spin Rip1^+/+ fetal liver cells resulted in unremitting footpad 
swelling, whereas genetic deletion of Rip1 in the haematopoietic 
compartment provided protection against Ptpn6^spin-associated inflammatory 
disease progression (Fig. 4d and Supplementary Fig. 17) and 
neutrophilia (Fig. 4e), a hallmark of this inflammatory syndrome. We 
proposed that targeted MAP kinase and NF-κB signalling drives 
Ptpn6^spin-associated inflammation. In agreement, we found that in vivo 
RIP1 inhibition markedly dampened local activation of ERK and NF-κB 
signalling (Fig. 4f). Moreover, pharmacological blockade of NF-κB 
activation with the IKK-β inhibitor SC-514 and inhibition of ERK 
signalling with U0126 treatment both abrogated hyperinflammatory 
cytokine production in Ptpn6^spin mice (Supplementary Fig. 18). 
Importantly, the RIP1 kinase inhibitor Nec1 also inhibited the synthesis 
of Il1a transcripts (Fig. 4g), further highlighting the role of RIP1 as a 
critical regulator of NF-κB-induced and IL-1α-driven autoinflammation 
in Ptpn6^spin mice. Notably, IL-1α deletion also attenuated exacerbated 
ERK and NF-κB signalling in the footpads of Ptpn6^spin mice 
(Fig. 4f), suggesting that RIP1-mediated IL-1α production triggers an 
inflammatory feedback loop that contributes to disease progression. In 
addition to driving MAP kinase and NF-κB activation, RIP1 controls 
induction of necroptosis in conjunction with RIP3 (ref. 20). To verify a 
potential role for unwarranted necroptosis induction in Ptpn6^spin-associated 
inflammatory disease, Rip3-deficient mice were bred to 
Ptpn6^spin mice. However, unlike deletion of Rip1 and Il1a, genetic ablation 
of Rip3 expression failed to protect Ptpn6^spin mice from exacerbated 
inflammation in response to microabrasion-induced tissue injury 
(Fig. 4h and Supplementary Fig. 19). These results indicate that RIP3-mediated 
necroptosis is dispensable, and suggest a critical role for RIP1-mediated 
regulation of MAP kinase and NF-κB signalling in driving the 
inflammatory phenotype of Ptpn6^spin mice. 

Defective neutrophil homeostasis is associated with numerous devastating 
human diseases. For instance, neutropenia can cause severe 
susceptibility to infection, whereas neutrophilia is linked to autoinflammatory 
disorders. Our results in the Ptpn6^spin mouse inflammation model 
highlight a critical role for RIP1-mediated ERK and NF-κB signalling in 
haematopoietic cells in driving an inflammatory circuit that triggers 
excessive inflammatory responses and persistent tissue damage. Indeed, 
biochemical and genetic blockade of RIP1 signalling prevented inflammatory 
cytokine production and protected Ptpn6^spin mice from autoinflammation. 
IL-1α was critical for RIP1-mediated inflammatory disease 
progression, which proceeded independently of inflammasome/caspase-1-produced 
IL-1β and RIP3-mediated necroptosis. Consequently, therapeutic 
inhibition of RIP1 activity and/or neutralization of IL-1α may 
provide novel approaches to break the self-reinforcing inflammatory 
circuits that drive chronic autoinflammatory and autoimmune diseases. 


METHODS SUMMARY 


Ptpn6^spin mice homozygous for the Tyr208Asn amino acid substitution in the 
C-terminal Src homology 2 domain of SHP-1 have been described previously. 
Ptpn6^spin mice spontaneously develop a persistent footpad disease that is 
characterized by paw swelling and cutaneous inflammation at 8-16 weeks of age. Blood 
was collected by submandibular venipuncture to measure the levels of circulating 
neutrophils and inflammatory cytokines. To assess T-cell-mediated cytokine 
production, splenocytes and popliteal lymph node cells were re-stimulated with 
PMA/ionomycin followed by intracellular flow cytometry staining. Formalin-preserved 
footpad samples were embedded in paraffin. Footpad infiltration by inflammatory 
cells and neutrophils was assessed in a blinded manner by a pathologist using 
haematoxylin and eosin staining and neutrophil immunohistochemistry. The 
accumulation of immune cells in lymphoid organs was evaluated with the use 
of flow cytometry staining. To provoke microabrasion injury, mice were 
anaesthetized and the plantar surfaces of their hind paws were irritated by gently 
rubbing with sterile sandpaper. Clinical scores were assigned based on oedema, 
erythema and weepy wound formation. The development of persistent footpad 
swelling was used to evaluate disease incidence over time. EAE was induced using 
MOG peptide, CFA, heat-inactivated Mycobacterium tuberculosis and Bordetella 
pertussis toxin. Neutrophils were purified from the bone marrow and stimulated with 
LPS. Fetal liver cells collected at embryonic day E14.5 were transplanted into lethally 
irradiated wild-type mice to generate Ptpn6^spin Rip1^−/− > wild-type chimaeric mice. 


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS 

Mice. Ptpn6^spin (ref. 3), Nlrp3^−/− (ref. 24), Casp1^−/− (ref. 24), Il1b^−/− (ref. 25), 
Il1a^−/− (ref. 26), Rip3^−/− (ref. 27) and Rip1^−/− (ref. 21) mice were previously 
described. All mice were housed under specific pathogen-free conditions within 
the Animal Resource Center at St Jude Children’s Research Hospital. Animal 
studies were conducted under protocols approved by the Institutional Animal 
Care and Use Committee of St Jude Children’s Research Hospital. 
Histopathology. Formalin-preserved feet were processed and embedded in paraffin 
according to standard procedures. Sections (5 µm) were stained with haematoxylin 
and eosin (H&E) and examined by a pathologist blinded to the experimental 
groups. For immunohistochemistry, formalin-fixed paraffin-embedded tissues 
were cut into 4 µm sections and slides were stained with anti-Gr-1 to stain 
neutrophils in the footpads. 

Microabrasion injury model. A novel microabrasion wound model was 
developed to evaluate the inflammatory response in a synchronized and controlled 
fashion. In this model, wild-type and disease-free Ptpn6?'" mice (4-8 weeks old) 
were anaesthetized and the plantar surfaces of the hind paws were irritated by 
gently rubbing with sterile sandpaper to induce physical trauma and microinjuries. 
Clinical scores were assigned daily based on the following scale: 0, no disease; 1, 
erythema; 2, erythema and mild swelling; 3, erythema, swelling and crusty wound 
formation; 4, weepy wound formation and severe swelling. The development of 
persistent footpad swelling was used to evaluate disease incidence over time. The 
levels of proinflammatory mediators that are produced in response to microabra- 
sion injury were measured in the serum 4-5 h after wound induction. For the in 
vivo necrostatin 1 experiments, mice were given either 50 µg necrostatin 1 (Nec1) 
(Sigma-Aldrich) or 50 µg of an inactive control analogue (iNec, Calbiochem) by 
the intraperitoneal route 1h before microabrasion irritation of the footpads. 
Circulating cytokine levels were measured in the serum 4-5 h later. 
Experimental autoimmune encephalomyelitis (EAE). Age- (5-8 weeks) and 
sex-matched mice were immunized subcutaneously with 100 µg MOG35-55 peptide 
(MEVGWYRSPFSRVVHLYRNGK) emulsified in CFA (Difco Laboratories) 
with 500 µg Mycobacterium tuberculosis on day 0. Mice also received 200 ng 
pertussis toxin (List Biological Laboratories) by intraperitoneal injection on days 
0 and 2. Disease severity was assessed daily by assigning clinical scores according to 
the following scale: 0, no disease; 1, tail paralysis; 2, weakness of hindlimbs; 3, 
paralysis of hindlimbs; 4, paralysis of hindlimbs and severe hunched posture; 5, 
moribund or death. To collect CNS leukocytes, mice were perfused through the left 
ventricle with PBS. The spinal cord was isolated, cut into small pieces, and then 
passed through a 70 µm cell strainer. Leukocytes were then purified by gradient 
centrifugation using a 38% Percoll solution. Cells were washed once in PBS and 
then re-suspended in media. 

In vivo serum cytokines. Blood was collected by submandibular venipuncture 
and allowed to clot for 30-60 min at room temperature. Serum was collected after 
centrifugation and cytokines were measured by ELISA. 

ELISA. Cytokine ELISA was performed according to manufacturer’s instructions 
(Millipore). 

Flow cytometry and antibodies. The following monoclonal antibodies were used 
for flow cytometric cell marker analysis: CD4 (L3T4), IFN-γ (XMG1.2), IL-17A 
(eBio17B7), MHCII (M5/114.15.2), CD11b (M1/70), CD19 (6D5), CD44 (IM7), 
Ly-6G (1A8), CD25 (3C7), B220 (RA3-6B2) and Gr-1 (RB6-8C5) from 
eBioscience and TCR-β (H57-597), CD8 (53-6.7), Foxp3 (FJK-16s), CD62L 
(MEL-14), TNF-α (MP6-XT22), CD11c (N418), CD45.1 (A20) and CD45.2 
(104) from Biolegend. Intracellular cytokine staining was done using the 
eBioscience IC fixation/permeabilization kit according to the manufacturer’s pro- 
tocol. Intracellular staining for the Foxp3 transcription factor was performed using 
the eBioscience Foxp3 staining set according to the manufacturer’s recommenda- 
tions. Flow cytometry data were acquired on an upgraded five-colour FACScan or 
multi-colour LSRII (BD) and were analysed with FlowJo software (TreeStar). 

Ex vivo lymphocyte re-stimulation. Splenocytes and lymph node (popliteal 
and mesenteric) cells were collected and re-stimulated with 20 ng ml^−1 phorbol 
12-myristate 13-acetate (PMA) and 500 ng ml^−1 ionomycin in the presence of 
monensin for 3-4 h. Cells were stained according to the manufacturer’s 
instructions (eBioscience). For the EAE experiment, splenocytes were harvested and 
re-stimulated with 30 µg ml^−1 MOG peptide. Supernatants were collected after 48 h 
to measure cytokine levels by ELISA. 


Bone marrow chimaeras. Bone marrow was flushed from the femurs and filtered 
through a 40 µm filter. 3-5 × 10^6 cells in 200 µl PBS were transferred by tail vein 
injection into lethally irradiated (1,000 rad) mice. Congenic CD45 markers were 
used to verify chimaerism. 

In vitro macrophage stimulation. Bone-marrow-derived macrophages 
(BMDMs) were generated by culturing bone marrow cells in L-cell-conditioned 
IMDM medium supplemented with 10% FBS, 1% non-essential amino acid, and 
1% penicillin-streptomycin for 5 days. BMDMs were seeded in 12-well cell culture 
plates and cultured overnight. To evaluate cytokine production, BMDMs were 
primed with 2 µg ml^−1 ultrapure Escherichia coli-derived LPS (Invivogen) for 3 h 
followed by 5 mM ATP (Sigma-Aldrich) for an additional 30 min. BMDMs were 
also separately stimulated with Salmonella enterica serovar Typhimurium (5 
MOI) for 4 h and supernatants were collected to evaluate cytokine secretion by 
ELISA. 

Neutrophil culture and in vitro stimulation. Bone marrow cells were isolated 
from the femurs of mice and neutrophils (CD11b* Gr-1*) were purified by flow 
cytometry sorting. Neutrophils (1 × 10^6 cells ml^−1) were stimulated with 100 ng 
ml^−1 ultrapure Escherichia coli-derived LPS (Invivogen). Supernatants were 
collected after 48 h of stimulation and cytokine levels were measured by ELISA. 
Western blotting. Footpad protein lysates were collected in RIPA lysis buffer 
supplemented with complete protease inhibitor cocktail (Roche) and PhosSTOP 
(Roche) using a tissue homogenizer. Samples were resolved by SDS-PAGE and 
transferred to polyvinylidene difluoride (PVDF) membranes via electroblotting. 
Membranes were blocked in 5% non-fat milk and incubated overnight at 4°C 
with primary antibodies. The membranes were then probed with horseradish 
peroxidase (HRP)-tagged secondary antibodies at room temperature for 1h. 
Immunoreactive proteins were visualized using the ECL method (Pierce). 
Real-time RT-PCR. Total RNA was isolated from the hind paws with Trizol 
(Invitrogen) according to the manufacturer’s instructions. 1 µg of RNA was 
reverse-transcribed to cDNA with random RNA-specific primers using the 
high-capacity cDNA reverse transcription kit (Applied Biosystems). Transcript 
levels of Il1a and Gapdh were analysed using SYBR-Green (Applied Biosystems) 
on an ABI7500 real-time PCR machine according to the manufacturers’ 
recommendations. Relative expression was calculated using the ΔΔCt standardization 
method. 
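
For readers unfamiliar with the ΔΔCt calculation, a minimal sketch follows. It assumes the common 2^(−ΔΔCt) form with 100% amplification efficiency, which the text does not state explicitly; the function name and the Ct values in the example are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript (e.g. Il1a) relative to a control
    sample, normalized to a reference transcript (e.g. Gapdh).

    Assumes the 2^(-ddCt) form, i.e. perfect doubling per PCR cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt, treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt, control sample
    delta_delta_ct = delta_sample - delta_control        # ddCt
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only
fold = relative_expression(ct_target_sample=24.0, ct_ref_sample=18.0,
                           ct_target_control=26.0, ct_ref_control=18.0)
print(fold)
```

A lower Ct means more starting template, so a ΔΔCt of −2 in this example corresponds to a fourfold increase over the control.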

Footpad pathology scoring. Footpad haematoxylin and eosin sections were 
scored based on the extent and severity of inflammation, ulceration and hyper- 
plasia of the mucosa in a blinded fashion by a veterinary pathologist. Severity 
scores for inflammation were as follows: 0, normal (within normal limits); 1, 
minimal (small, focal, or widely separated); 2, mild; 3, moderate (moderate multi- 
focal inflammation with dermatitis, suppurative, coalescing with intraepithelial 
and follicular abscesses); 4, marked (marked inflammation, with intraepidermal 
pustules, epidermal hyperplasia, acantholysis, dermatitis, perifolliculitis); 5, severe 
(severe inflammation, with intraepidermal pustules, epidermal hyperplasia, 
acantholysis, dermatitis, perifolliculitis, lesions covering >50% of the section). 
APAP-induced hepatotoxicity model. Acetaminophen (Sigma-Aldrich) was dis- 
solved in sterile PBS by heating the solution to 55 °C. Mice that fasted overnight for 
16-18 h received 250 mg kg^−1 of acetaminophen (APAP) by intraperitoneal 
injection. Mice were harvested 18-20 h after injection and the levels of serum alanine 
aminotransferase (sALT) were measured in the blood by ELISA. 

Statistical analysis. All results are presented as means ± standard errors. We 
performed statistical analysis using the two-tailed Student’s t-test. Differences 
were considered statistically significant when P < 0.05. P values are denoted by 
*P<0.05, **P < 0.01, ***P < 0.001. 
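
The summary statistics and star notation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names and the "ns" label for non-significant results are introduced here, the t statistic uses the pooled-variance (equal variance) form of Student's t-test, and converting it to a two-tailed P value would need a t-distribution routine from a statistics package.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def student_t(group_a, group_b):
    """Pooled-variance Student's t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, df


def significance_stars(p_value):
    """Map a P value to the star notation used in the figure legends."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # label assumed here; the text defines only the starred levels
```

For example, `significance_stars(0.005)` returns `"**"`, matching the **P < 0.01 threshold in the legends.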
Plasmodium virulence

Defining mechanisms by which Plasmodium virulence is regulated
is central to understanding the pathogenesis of human malaria. 
Serial blood passage of Plasmodium through rodents, primates
or humans increases parasite virulence, suggesting that vector
transmission regulates Plasmodium virulence within the mammalian
host. In agreement, disease severity can be modified by
vector transmission, which is assumed to ‘reset’ Plasmodium
to its original character. However, direct evidence that vector
transmission regulates Plasmodium virulence is lacking. Here we
use mosquito transmission of serially blood passaged (SBP)
Plasmodium chabaudi chabaudi to interrogate regulation of parasite
virulence. Analysis of SBP P. c. chabaudi before and after mosquito 
transmission demonstrates that vector transmission intrinsically 
modifies the asexual blood-stage parasite, which in turn modifies 
the elicited mammalian immune response, which in turn attenu- 
ates parasite growth and associated pathology. Attenuated parasite 
virulence associates with modified expression of the pir multi-gene 
family. Vector transmission of Plasmodium therefore regulates 
gene expression of probable variant antigens in the erythrocytic 
cycle, modifies the elicited mammalian immune response, and thus 
regulates parasite virulence. These results place the mosquito at the 
centre of our efforts to dissect mechanisms of protective immunity 
to malaria for the development of an effective vaccine. 

The definitive host for mammalian Plasmodium is the anopheline 
mosquito. Within this vector, a complex series of developmental 
events, including fertilization and meiosis, culminates in invasion of 
the salivary glands by infective sporozoites, which are transmitted to 
the mammalian host through mosquito bite. Sporozoites deposited 
in the dermis migrate to the liver, invade hepatocytes and undergo 
further developmental processes before the release of merozoites that 
invade erythrocytes. The subsequent erythrocytic cycle is entirely 
responsible for the morbidity and mortality associated with malaria. 
The complexity of the Plasmodium life cycle has led to much of the 
basic biology of the blood-stage infection being studied in isolation, 
with in vivo experiments largely initiated through direct injection of 
infected erythrocytes. However, serial blood passage of Plasmodium
increases parasite virulence, suggesting that regulation of
Plasmodium virulence is an inherent consequence of vector transmission.
This could result indirectly from vector control of inoculum size or the
passage of large parasite populations through extreme bottlenecks,
although these consequences of mosquito transmission are not
thought to be major determinants of disease severity. Alternatively,
vector transmission may regulate Plasmodium virulence by
intrinsically modifying the parasite and its interaction with the
mammalian host. In this context, the immune response elicited by
Plasmodium influences disease severity, and can therefore dictate
parasite virulence. The interrelationship between the vector, parasite 
and mammalian immune system could thus underpin the pathogen- 
esis of malaria. 

To study regulation of Plasmodium virulence we developed routine
mosquito transmission of SBP P. c. chabaudi, a rodent malaria parasite
that has many characteristics associated with the pathogenesis of
human infection. This allowed us to directly compare SBP parasites
before and after vector transmission. Accordingly, mice were infected 
with SBP P.c. chabaudi AS either by injection of parasitized erythro- 
cytes (pE) or mosquito bite (see Methods). Following mosquito trans- 
mission, asexual blood-stage parasite growth was attenuated (Fig. 1a), 
and a low-grade, recrudescing infection with extended chronicity was 
established (Supplementary Fig. 1). Attenuated parasite growth in the 
erythrocytic cycle was not influenced by dose (ref. 9 and Supplementary 
Fig. 2) or, importantly, by the pre-erythrocytic stages of infection, as 
attenuated parasite growth was similarly observed when mice were 
injected with pE derived from recently mosquito-transmitted (MT) 
parasite lines (Fig. 1b). Similar results were observed with cloned para- 
sites derived from SBP P. c. chabaudi AS (Supplementary Fig. 3), and 
with the hypervirulent P.c. chabaudi CB (Supplementary Fig. 4). 
Mosquito transmission therefore attenuated the asexual blood-stage 
parasite. As expected, serial blood passage of MT P.c. chabaudi AS 
rapidly increased parasite growth (Supplementary Fig. 5). Mice infected 
with P. c. chabaudi AS through mosquito bite did not show the severe 
hypothermia, cachexia or hepatic cellular damage that was observed 
during the acute phase of infection in mice injected with SBP parasites, 
although they still showed severe anaemia despite attenuated parasite 
growth (Fig. 1c-f). Mosquito transmission therefore reduced disease 
severity in the mammalian host. Despite attenuated parasite growth and 
reduced pathogenicity, MT P. c. chabaudi AS elicited robust, long-term 
protection to reinfection with homologous or heterologous blood-stage 
parasites (Fig. 1g and Supplementary Fig. 6). Thus, vector transmission 
regulates the virulence of Plasmodium by intrinsically modifying the 
asexual blood-stage parasite, without influencing the capacity of the 
mammalian host to acquire robust immunity to reinfection. 

The pathogenesis of malaria is complex and influenced by the mammalian
immune system; dysregulated immune reactions can directly
promote severe disease, whereas an appropriate response can enhance
parasite clearance without promoting pathology. The immune response
induced by Plasmodium can therefore define its virulence.
Throughout the erythrocytic cycle the spleen is the major anatomical
site associated with the developing immune response, and mice
infected with P. c. chabaudi AS through mosquito bite developed
marked splenomegaly with rapid recruitment of inflammatory monocytes
(Supplementary Figs 7 and 8). Importantly, following mosquito
transmission there was enhanced expansion of activated CD8α⁺ and
CD8α⁻ dendritic cells, which present malaria-specific antigens and
stimulate CD4⁺ T-cell proliferation, in the acute phase of infection
(Fig. 2a and Supplementary Fig. 9). Correspondingly, the magnitude of
the effector CD4⁺ T-cell response, which orchestrates innate and
adaptive immune control of blood-stage parasite growth, was also
enhanced following mosquito transmission, and the memory CD4⁺
T-cell population showed a predominantly effector memory phenotype
(Fig. 2b and Supplementary Fig. 10). Infection with MT P. c. chabaudi
AS also increased the magnitude of the class-switched malaria-specific
antibody response, a central component of erythrocytic immunity
Figure 1 | Mosquito transmission of P. c. chabaudi AS attenuates virulence.
a, Parasitaemia of C57BL/6 mice injected with 10° SBP P. c. chabaudi AS (Pcc 
AS) or infected with Pcc AS through mosquito bite. b, Parasitaemia of C57BL/6 
mice injected with 10° SBP Pcc AS or injected with 10* or 10° pE derived from 
one of four recently MT lines of Pcc AS. c-e, Temperature (c), weight (d) and
erythrocyte count (e) of C57BL/6 mice injected with 10° SBP Pcc AS or infected 
with Pcc AS through mosquito bite. f, Liver enzyme concentration on day 10 
post-infection in plasma of C57BL/6 mice injected with 10° SBP Pcc AS or 
infected with Pcc AS through mosquito bite. Data presented as fold-change 
relative to uninfected control mice. ALB, albumin; ALP, alkaline phosphatase; 
ALT, alanine aminotransferase; BA, bile acids; BUN, blood urea nitrogen; 
CHO, cholesterol; TBI, total bilirubin. g, Parasitaemia of C57BL/6 mice injected 
with 10° SBP P. c. chabaudi CB (Pcc CB) as a first infection (open symbols), or 
as a rechallenge (closed symbols) 90 days after injection with 10° SBP Pcc AS or
infection with Pcc AS or CB through mosquito bite. (n = 3-20 mice per group; 
data presented as mean with s.e.m.). 


(Fig. 2c, d). Mosquito transmission therefore enhanced antibody pro- 
duction in the chronic phase of infection, subsequent to enhanced 
innate and adaptive cellular responses early in infection. Conversely, 
mosquito transmission attenuated systemic inflammation during the 
acute phase response, with decreased circulating levels of pro-inflammatory
cytokines and chemokines associated with severe disease
(Fig. 2e). Thus, vector transmission intrinsically modifies the asexual 
blood-stage parasite and transforms the mammalian immune response 
elicited during the erythrocytic cycle. 
Figure 2 | Mosquito transmission of P. c. chabaudi AS transforms the
elicited mammalian immune response. a, b, Number of CD8α⁻ (open bars)
and CD8α⁺ (closed bars) dendritic cells (a), and effector (TE) (open bars) and
memory (TM) (closed bars) CD4⁺ T cells (b) in spleens of C57BL/6 mice
injected with 10° SBP Pcc AS or infected with Pcc AS through mosquito bite.
c, d, Plasma concentration of total parasite-specific IgG throughout infection 
(c) and parasite-specific IgG subclasses on day 80 post-infection (d) in C57BL/6 
mice injected with 10° SBP Pcc AS or infected with Pcc AS through mosquito 
bite. Data presented as arbitrary units (AU) relative to hyper-immune plasma. 
e, Plasma cytokine concentration in C57BL/6 mice injected with 10° SBP Pcc 
AS or infected with Pcc AS through mosquito bite. (n = 3-5 mice per group per
time-point; data presented as mean with s.e.m.). 


Parasite growth and pathogenicity are, in part, determined by host 
susceptibility. Infection of susceptible mouse strains with P. c. chabaudi 
AS through mosquito bite causes severe disease and death, demonstrating
that vector transmission does not limit the potential virulence
of the asexual blood-stage parasite. We therefore addressed
whether attenuated parasite virulence in C57BL/6 mice infected with
MT P. c. chabaudi AS was a consequence of the transformed host
immune response. Immunodeficient mice were injected with SBP
P. c. chabaudi AS, or with an equivalent number of pE derived from a
recently MT line. Disruption of the innate and adaptive immune responses,
through depletion of CD4⁺ T cells, or depletion of the entire
adaptive arm of the immune system led to a virulent and fatal acute
phase infection (Fig. 3). Attenuation of parasite virulence by mosquito 
transmission was therefore dependent upon an intact mammalian 
immune response and, furthermore, independent of parasite growth 
rate (Supplementary Fig. 11). Immune control of parasite virulence 
therefore resulted directly and exclusively from modified innate and 
Figure 3 | Transformed innate and adaptive immune responses attenuate
P. c. chabaudi AS virulence. a-c, Parasitaemia of wild-type (a), CD4⁺ T-cell
deficient (MHC II KO) (b), and B- and T-cell deficient (RAG KO) (c) C57BL/6 
mice injected with 10° SBP Pcc AS or injected with 10° pE derived from a 
recently MT line of Pcc AS. (n = 4-6 mice per group; data presented as mean 
with s.e.m.). 


adaptive immune responses elicited by, and directed against, the blood- 
stage parasite. Vector transmission of Plasmodium thus intrinsically 
modifies the asexual blood-stage parasite, which in turn modifies the 
elicited mammalian immune response, which in turn regulates parasite
virulence. 

Defining parasite gene expression in the erythrocytic cycle after 
vector transmission is thus central to understanding the pathogenesis 
of malaria. We therefore performed genome-wide RNA sequencing on 
P.c. chabaudi AS, directly comparing blood-stage parasites before and 
after mosquito transmission. This allowed us to identify a set of 
Plasmodium virulence genes that direct the elicited mammalian 
immune response (Fig. 4). Vector transmission modified expression 
of approximately 10% of the entire genome in the late trophozoite 
stage parasite (Supplementary Tables). The majority of genes upregu- 
lated following mosquito transmission encoded exported proteins, 
with the potential to access and modulate the mammalian immune 
system. Importantly, parasite gene expression was most intensely 
regulated within the sub-telomeric large multi-gene families, with 
preferential regulation of the pir multi-gene family (termed cir in 
P.c. chabaudi) (Supplementary Fig. 12). Out of 200 cir genes, 123 
(61.5%) were differentially expressed following mosquito transmis- 
sion, with 114 cir genes (57%) upregulated. Furthermore, the most 
upregulated gene following serial blood passage was identified as the 
most highly expressed cir gene (PCHAS_110030) in mice infected with 
SBP P. c. chabaudi AS (ref. 17, Fig. 4 and Supplementary Fig. 12). Serial 
blood passage therefore selected for dominant cir gene expression, 
whereas mosquito transmission revoked the selected expression hier- 
archy and promoted a generalized increase in cir expression across the 
parasite population. We therefore uncover a direct association between 
pir gene expression and Plasmodium virulence, and demonstrate that 
vector transmission regulates expression of probable antigenic variants,
as proposed previously. Vector transmission of Plasmodium thus
regulates parasite gene expression in the erythrocytic cycle and, conse- 
quently, regulates immune control of Plasmodium virulence. 
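
As a quick arithmetic check of the cir gene proportions quoted in this paragraph, the counts can be recomputed directly. The variable names are ours; the remainder of nine genes is simply the difference between the two reported counts and is not a figure stated in the text.

```python
TOTAL_CIR_GENES = 200            # cir genes considered
DIFFERENTIALLY_EXPRESSED = 123   # changed following mosquito transmission
UPREGULATED_AFTER_MT = 114       # subset upregulated after mosquito transmission


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


de_pct = percent(DIFFERENTIALLY_EXPRESSED, TOTAL_CIR_GENES)   # 61.5
up_pct = percent(UPREGULATED_AFTER_MT, TOTAL_CIR_GENES)       # 57.0
# Differentially expressed cir genes that were not upregulated after MT.
not_upregulated = DIFFERENTIALLY_EXPRESSED - UPREGULATED_AFTER_MT
```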

Vector transmission will inherently regulate Plasmodium virulence 
within the mammalian host. Recombination of distinct parasite geno- 
types within the mosquito is likely to be fundamental for the evolution 
of virulence. The results of this study reveal that vector transmission
also regulates Plasmodium virulence by modifying parasite gene 
expression, and therefore the mammalian immune response, in the 
erythrocytic cycle. This is probably the outcome of a combination of 
distinct regulatory processes acting at multiple stages of the parasite 
life cycle, in both the mosquito vector and the mammalian host. It is 
therefore important to delineate the timing and mechanism(s) of regu- 
lation of parasite gene expression, in the context of the complete 
Plasmodium life cycle, to understand the molecular regulation of parasite 


Figure 4 | Mosquito transmission of P. c. chabaudi AS modifies parasite gene
expression in the erythrocytic cycle. C57BL/6 
mice were injected with 10° SBP Pcc AS or infected 
with Pcc AS through mosquito bite. Parasites were 
isolated after six cycles of the blood-stage infection, 
and at the late trophozoite stage of development 
(98.3% (0.76%) and 97.0% (1.41%) trophozoites for 
SBP and MT samples, respectively (mean with 
s.d.)). Total parasite RNA was extracted and 
sequenced. Those genes differentially expressed 
between SBP and MT Pcc AS were determined; 
genes identified as significantly upregulated in 
blood-stage parasites following mosquito 
transmission (left) versus serial blood passage 
(right) are shown. Each segment represents one 
gene, and genes are categorised according to the 
function of their product and ranked based on fold- 
change (outer circle). The DESeq-normalized 
expression levels for each gene are also shown 
(inner circles). Sepia wedges highlight genes whose 
products are predicted to be exported, or otherwise 
accessible to the mammalian immune system. 
virulence. Attenuation of parasite virulence following mosquito transmission
associates with modified expression of the pir multi-gene family,
which is conserved from rodent to human Plasmodium. Importantly,
vector transmission of cultured Plasmodium falciparum similarly modifies
the composition and frequency of var gene expression. Regulation
of antigenic variants by vector transmission is therefore universal,
and vector transmission will therefore universally regulate immune con- 
trol of Plasmodium virulence. The interrelationship between the vector, 
parasite and mammalian immune system thus underpins the pathogen- 
esis of malaria. 


METHODS SUMMARY 


P.c. chabaudi AS and CB were mosquito transmitted (MT) and cloned at the 
University of Edinburgh, UK, and sent to NIMR in 1978 and 1982, respectively. 
Parasites were serially blood passaged (SBP) through mice 26-32 times before use 
in this study. To initiate infections with SBP parasites, mice were injected intra- 
peritoneally (i.p.) or intravenously (i.v.) with 10*-10° pE derived from cryopre- 
served stocks. Alternatively, SBP parasites were transmitted through Anopheles 
stephensi and mice were infected by mosquito bite, with an estimated 9.15 infective
bites per mouse. We therefore directly compared SBP parasites before and after
mosquito transmission. The first 52 h of infection initiated by mosquito bite was
required to complete the pre-erythrocytic stages; the erythrocytic cycle thus
started on day 2 post-infection. To bypass the pre-erythrocytic stages, and control 
the dose initiating the blood-stage infection, mice were injected i.p. or i.v. with 
10*-10° pE derived from recently MT parasite lines that were just one blood 
passage from mosquito transmission (unless otherwise stated). The course of 
infection was monitored on thin blood smears by enumerating the percentage 
of erythrocytes infected with asexual parasites (parasitaemia). The limit of detection 
for patent parasitaemia was 0.01% infected erythrocytes. To determine chronicity
of infection, 100 µl blood was sub-inoculated into RAG KO mice; absence of
parasitaemia in recipient mice after 14 days indicated clearance of infection in
donor mice. Rechallenge studies were initiated ≥90 days after the first infection,
when ≥95% of C57BL/6 mice had naturally cleared blood-stage parasites.
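
The parasitaemia measure and detection limit described above can be sketched as follows. The function names and the example counts are hypothetical, and treating a value exactly at the limit as patent is our assumption.

```python
LIMIT_OF_DETECTION_PCT = 0.01  # patent parasitaemia threshold, % infected erythrocytes


def parasitaemia_percent(infected_erythrocytes, total_erythrocytes):
    """Percentage of counted erythrocytes infected with asexual parasites."""
    if total_erythrocytes <= 0:
        raise ValueError("total_erythrocytes must be positive")
    return 100.0 * infected_erythrocytes / total_erythrocytes


def is_patent(parasitaemia_pct):
    """True if parasitaemia is at or above the smear detection limit.

    Counting a value exactly at the limit as patent is an assumption.
    """
    return parasitaemia_pct >= LIMIT_OF_DETECTION_PCT


# Hypothetical count: 5 infected cells among 1,000 erythrocytes on a smear.
example = parasitaemia_percent(5, 1000)
```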


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS

Mice. Inbred wild-type, major histocompatibility complex class II knockout
(MHC II KO) and recombination activation gene 1 knockout (RAG KO)
C57BL/6 mice were bred under specific pathogen-free conditions at NIMR. All 
experiments were performed in accordance with UK Home Office regulations 
(PPL 80/2358) and approved by the ethical review panel at NIMR. Experi- 
mental mice were age- and sex-matched, housed under reverse light conditions 
(light 19:00-07:00, dark 07:00-19:00) at 20-22 °C, and had continuous access to 
mouse breeder diet and water. Measurements of clinical pathology were taken at 
16:00. Core body temperature was measured with a rectal thermometer; body 
weight was calculated relative to a baseline measurement taken on day −2; and
erythrocyte density was determined on a VetScan HMII haematology system 
(Abaxis). To measure liver enzymes, plasma was analysed on a VetScan 
Chemistry Analyzer, using a Mammalian Liver Profile reagent rotor (Abaxis). 

Enumeration of blood-stage parasites by real-time PCR. Whole blood was
isolated 20 h after liver merozoite egress, when parasites were at the late trophozoite
stage of development and within the first cycle of schizogony. Total RNA was
extracted by acid guanidinium thiocyanate-phenol-chloroform extraction, and
reverse transcribed by PCR at 42 °C using 75 U MuLV reverse transcriptase and
2.5 µM random hexamer primers (both Applied Biosystems) per sample. Parasites
were quantified by real-time PCR, comparing P. c. chabaudi AS 18S ribosomal 
RNA copy number between samples and a standard curve of pE prepared at the 
late trophozoite stage of development. The reaction mix contained TaqMan 
Universal PCR Master Mix (Applied Biosystems), 300 nM forward primer (5’-
AAGCATTAAATAAAGCGAATACATCCTTAT-3’), 300 nM reverse primer
(5’-GGGAGTTTGGTTTTGACGTTTATGCG-3’) and 50 nM probe (5’-6FAM-
CAATTGGTTTACCTTTTGCTCTTT-TAM-3’). Real-time PCR amplification
was performed on an ABI Prism 7000 Sequence Detection System (Applied
Biosystems), with a temperature profile as follows: 50 °C for 2 min, followed by
95 °C for 10 min, and then 40 cycles of 95 °C for 15 s and 60 °C for 1 min. Parasite
numbers were determined per 100 µl whole blood; total circulating parasites were
then calculated for each mouse based on their weight and, therefore, their estimated 
circulating blood volume. 
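
The final scaling step, from parasites per 100 µl to total circulating parasites, might look like the sketch below. The conversion from body weight to blood volume is not given in the text, so the factor here (0.08 ml of blood per gram of body weight) is purely an illustrative assumption and is exposed as a parameter.

```python
ASSUMED_BLOOD_ML_PER_G = 0.08  # illustrative assumption; not stated in the text


def total_circulating_parasites(parasites_per_100ul, body_weight_g,
                                blood_ml_per_g=ASSUMED_BLOOD_ML_PER_G):
    """Scale a per-100-µl parasite count to the estimated whole blood volume."""
    blood_volume_ul = body_weight_g * blood_ml_per_g * 1000.0  # ml -> µl
    return parasites_per_100ul * blood_volume_ul / 100.0


# Hypothetical 20 g mouse with 1,000 parasites per 100 µl of whole blood.
estimate = total_circulating_parasites(1000, 20.0)
```

Passing the blood-volume factor explicitly keeps the assumption visible and makes it easy to substitute the value actually used in a given study.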

Flow cytometry. Single-cell suspensions of splenocytes were prepared, erythro- 
cytes lysed, and cells enumerated on a haemocytometer. Cells were stained with 
monoclonal antibodies (CD3ε biotin or PerCP-Cy5.5 (145-2C11); CD4 Pacific
Blue (RM4-5); CD8α Pacific Blue (53-6.7); CD11b Pacific Blue (M1/70); CD11c
APC (N418); CD44 FITC (IM7); I-Ab FITC (AF6-120.1); Ly-6G PE (1A8); NK-1.1
biotin (PK136); TER-119 biotin (all from BioLegend)) (CD19 biotin (1D3); 
CD62L APC (MEL-14); CD127 PE (A7R34) (all from eBioscience)) (Ly-6C 
Alexa Fluor 647 (ER-MP20) (from AbD Serotec)) or irrelevant isotype-matched 
monoclonal antibodies as negative controls. PerCP/Cy5.5-conjugated streptavidin 
(BD Biosciences) was used secondary to biotinylated antibodies. For phenotypic 
analysis, samples were acquired on a CyAn (Beckman Coulter), and data were 
analysed with FlowJo software (TreeStar). 

Antibodies and cytokines. Malaria-specific antibodies were measured in plasma 
by enzyme-linked immunosorbent assay. 96-well PolySorp plates (Nunc) were 
coated with 50 µg ml⁻¹ parasite lysate prepared from pE isolated from C57BL/6
mice infected with SBP P. c. chabaudi AS; twofold serial dilutions of plasma from
uninfected and hyper-immune mice were used as negative and positive controls,
respectively, for experimental samples; alkaline phosphatase-conjugated goat
anti-mouse IgG, IgG1, IgG2c, IgG2b and IgG3 (all from SouthernBiotech) were used
for detection. Samples were developed with 1 mg ml⁻¹ 4-nitrophenyl phosphate
disodium salt hexahydrate (Sigma) and attenuance was measured at 405 nm. 
Antibody concentrations are presented as arbitrary units (AU) relative to 
hyper-immune plasma. Cytokines were measured in plasma by LEGENDplex 
Luminex custom assay (BioLegend). 

Plasmodium RNA preparation. C57BL/6 mice were infected with SBP 
P.c. chabaudi AS through injection of infected erythrocytes, or through mosquito 
bite. Parasites were isolated at exactly 20 h into the seventh cycle of the blood-stage 
infection, at the late trophozoite stage of development, as follows. Whole blood 
was depleted of leukocytes by Plasmodipur filtration (EuroProxima); erythrocytes 
were centrifuged at 400g for 10 min and lysed with 0.15% (w/v) saponin (Sigma). 
Samples were centrifuged at 1,000g for 5 min and washed with PBS; parasites 
were resuspended in TRIzol (Life Technologies) and snap-frozen on dry ice. We
prepared three biological replicates of SBP P. c. chabaudi AS from eight mice each,
and two biological replicates of MT P. c. chabaudi AS from 30 mice each. RNA was
extracted as described, resuspended in water and DNA removed with a TURBO
DNA-free Kit (Applied Biosystems), according to the manufacturer’s instructions. 
RNA quantity/quality was determined on an Agilent 2100 Bioanalyzer RNA 6000 
Nano chip. 

Amplification-free RNA-seq libraries. PolyA+ transcripts were selected from
10 µg total RNA using Sera-Mag Oligo(dT)-coated Magnetic Particles (Thermo
Scientific). RNA was diluted with water to a volume of 130 µl and fragmented to
approximately 200 nucleotides using Covaris Adaptive Focused Acoustics technology
(settings: 5% duty cycle; intensity 5; 200 cycles per burst for 60 s). The RNA
was ethanol-precipitated and resuspended in 10 µl water. First-strand cDNA was
synthesized with Random Hexamer primers and SuperScript II Reverse
Transcriptase (Life Technologies), following the manufacturer’s instructions.
Second-strand cDNA synthesis, end repair and dA-tailing were performed using
the NEBNext mRNA library kit for Illumina (New England Biolabs), eluting in a
final volume of 15 µl. Sequencing templates were prepared by mixing 15 µl cDNA,
5 µl 33 µM adaptors (based on the published adaptor with the addition of barcode
sequences; oligonucleotides supplied by Integrated DNA Technologies),
25 µl Quick Ligation buffer and 5 µl Quick DNA ligase (both from New
England Biolabs) and incubating for 15 min at 25 °C. Excess adaptors were
removed with two rounds of clean up with 50 µl of Agencourt AMPure XP
Beads (Beckman Coulter). Final libraries were eluted in 30 µl water, visualized
on an Agilent Bioanalyzer 2100 High Sensitivity DNA chip and quantified by
qPCR. A pool of the five indexed libraries was sequenced on an Illumina
HiSeq2000, with 100-bp paired-end reads.

Analysis of RNA expression. Paired-end RNA sequencing reads were mapped to
the P. c. chabaudi AS reference genome (September 2012 release:
ftp://ftp.sanger.ac.uk/pub/pathogens/P_chabaudi/September_2012/) using Tophat v1.4.1, with
appropriate fragment size parameters and maximum intron size 10000. Read counts
per gene were calculated using in-house Perl scripts and non-uniquely mapping reads
were excluded. Six genes with less than 20% unique coding sequence (kmer = 100)
were excluded from the analysis (PCHAS_073210; PCHAS_083750; PCHAS_100020;
PCHAS_113280; PCHAS_113290; PCHAS_130130). Differential gene expression
between SBP and MT P. c. chabaudi AS was determined using DESeq; the three
SBP P. c. chabaudi AS replicates were compared against the two MT P. c. chabaudi
AS replicates to determine genes upregulated in blood-stage parasites following mosquito
transmission, and vice versa to determine genes upregulated following
serial blood passage. In both cases a corrected P value cutoff of 0.01 was applied.
The resulting gene lists were categorised into ‘cir’ (based on published annotation),
‘Pc-fam’ (based on GeneDb annotation by Ulrike Boehme at the Wellcome Trust
Sanger Institute, with minor reannotation), ‘Exported’ (based on known biology or
ExportPred prediction), ‘Other known function’ and ‘Unknown function’. For those
genes within the category ‘Other known function’, we sub-categorised genes based on
enriched biological process GO terms using TopGO; a P value cutoff of 0.01 was
applied. We independently added ‘glideosome’ as a sub-category.
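
The thresholding step described above (a corrected P value cutoff of 0.01) can be illustrated with a small sketch. The text does not name the correction method, so the Benjamini-Hochberg procedure used here is an assumption, and the gene names and P values are hypothetical. This is not a reimplementation of DESeq, only of the multiple-testing filter applied to its output.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic adjusted values.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def significant_genes(gene_pvalues, cutoff=0.01):
    """Names of genes whose adjusted P value is below the cutoff."""
    names = list(gene_pvalues)
    adjusted = benjamini_hochberg([gene_pvalues[name] for name in names])
    return [name for name, padj in zip(names, adjusted) if padj < cutoff]


# Hypothetical raw P values for five genes.
raw = {"gene_a": 0.001, "gene_b": 0.008, "gene_c": 0.039,
       "gene_d": 0.041, "gene_e": 0.042}
hits = significant_genes(raw)
```

Running the filter once per direction (upregulated after mosquito transmission, and upregulated after serial blood passage) would mirror the two comparisons described in the paragraph.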
Negligible impact of rare autoimmune-locus
coding-region variants on missing heritability

Genome-wide association studies (GWAS) have identified common
variants of modest-effect size at hundreds of loci for common auto- 
immune diseases; however, a substantial fraction of heritability 
remains unexplained, to which rare variants may contribute. To
discover rare variants and test them for association with a pheno- 
type, most studies re-sequence a small initial sample size and then 
genotype the discovered variants in a larger sample set. This
approach fails to analyse a large fraction of the rare variants present 
in the entire sample set. Here we perform simultaneous amplicon- 
sequencing-based variant discovery and genotyping for coding 
exons of 25 GWAS risk genes in 41,911 UK residents of white 
European origin, comprising 24,892 subjects with six autoimmune 
disease phenotypes and 17,019 controls, and show that rare coding- 
region variants at known loci have a negligible role in common 
autoimmune disease susceptibility. These results do not support 
the rare-variant synthetic genome-wide-association hypothesis
(in which unobserved rare causal variants lead to association 
detected at common tag variants). Many known autoimmune dis- 
ease risk loci contain multiple, independently associated, common 
and low-frequency variants, and so genes at these loci are a priori 
stronger candidates for harbouring rare coding-region variants 
than other genes. Our data indicate that the missing heritability 
for common autoimmune diseases may not be attributable to the 
rare coding-region variant portion of the allelic spectrum, but per- 
haps, as others have proposed, may be a result of many common- 
variant loci of weak effect.

Recent large-scale human sequencing studies have revealed an 
abundance of rare variants (which we define as minor allele frequency 
(MAF) < 0.5%) and shown that these are geographically localized and 
are more likely to have deleterious functional consequences. In the
largest sample size studied to date, 202 genes in 14,002 people were
re-sequenced, and ~95% of exonic variants identified were found to be
rare, with 74% observed in only one or two subjects. More broadly,
across ~15,000 genes, similar findings were observed in recent
exome-sequencing studies of 2,440 and 6,515 subjects. Importantly, these
studies demonstrate that even if we had reference variation databases 
from a million subjects, most of the rare-variant allelic spectrum of any 
given sample set (for example, a case-control cohort) will be unique 
and only identifiable by direct re-sequencing of the entire sample set. 


There are only a handful of published examples of rare coding-region 
variants associated with common autoimmune diseases (although 
many examples in familial/Mendelian immune-mediated diseases). 
Coding-region variants in IFIH1 associated with type 1 diabetes 
(MAF in controls = 0.67-2.2%), TYK2 with multiple autoimmune
diseases and IL23R with inflammatory bowel disease, for example,
are low frequency (which we define as MAF = 0.5-5%) rather than
particularly rare. In other examples, the existing evidence for association,
and/or the effect sizes, are relatively weak (for example,
CARD14 and psoriasis, IL2RA and IL2RB and rheumatoid arthritis).
The association of rare coding-region variants of NOD2 (also known as
CARD15) in Crohn’s disease probably provides the best example,
albeit three low-frequency variants comprise over 80% of all the
disease-causing mutations. Most of the studies also lose power (especially
for tests in which multiple rare variants are pooled into a single analysis,
for example by gene) by initially sequencing only a small sample subset
rather than testing the entire rare-variant content of a large case-control
sample set. We sought to improve on these methods by performing 
highly multiplexed sequencing of sufficiently high quality to enable 
direct genotyping in the entirety of a large autoimmune disease case- 
control collection. 
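
The frequency classes used in the text (rare, MAF < 0.5%; low frequency, MAF = 0.5-5%) can be made concrete with a short sketch. It assumes diploid autosomal genotypes, so that n individuals carry 2n alleles, and it labels anything above 5% as common; the function names are ours.

```python
RARE_THRESHOLD = 0.005          # MAF < 0.5% is defined as rare
LOW_FREQUENCY_THRESHOLD = 0.05  # MAF 0.5-5% is defined as low frequency


def minor_allele_frequency(alt_allele_count, n_individuals):
    """Minor allele frequency, assuming two alleles per individual."""
    frequency = alt_allele_count / (2.0 * n_individuals)
    # The minor allele is whichever allele is less common.
    return min(frequency, 1.0 - frequency)


def classify_maf(maf):
    """Assign a frequency class using the thresholds defined in the text."""
    if maf < RARE_THRESHOLD:
        return "rare"
    if maf <= LOW_FREQUENCY_THRESHOLD:
        return "low-frequency"
    return "common"


# A variant seen once (a singleton) among 17,019 controls.
singleton_maf = minor_allele_frequency(1, 17019)
```

A singleton in a control set of this size has a frequency of roughly 3 in 100,000, far below the 0.5% threshold, which is why such variants can only be found by sequencing the sample set itself.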

We selected subjects from a single population, individuals of white
Northern-European ethnicity living in the UK (Methods), to minimize
any effects of population stratification. We chose to re-sequence
all RefSeq exons for 25 genes from 20 GWAS-identified risk
loci showing overlap between six common autoimmune disease phe- 
notypes (autoimmune thyroid disease, coeliac disease, Crohn’s disease, 
psoriasis, multiple sclerosis and type 1 diabetes). All genes studied were 
from risk loci for at least two phenotypes, all genes had known immune 
system function, 18 out of 20 loci had either a single candidate immune 
gene or all immune genes at a locus were selected (the remaining two 
loci had partial transcripts of another immune gene within the 0.1 cen- 
timorgan (cM) linkage disequilibrium block), and all genes and loci 
were densely genotyped on the Illumina ImmunoChip (Supplementary
Table 1). We attempted high-throughput sequencing of 52,224
samples (including positive and negative controls, and repeats). We 
performed extensive quality control on both samples and variant calls 
(Methods). The final data set comprised 41,911 phenotyped indivi- 
duals (autoimmune disease cases and controls), with ImmunoChip 
Table 1 | Variant types in protein-coding regions of 25 genes in 
41,911 phenotyped individuals 

Variant type            All variants   Rare (MAF <0.5%)*   Novel†
Nonsynonymous SNV              1,792               1,758    1,379
Splicing SNV                      86                  85       65
Stopgain SNV                      47                  47       42
Synonymous SNV                 1,024                 972      674
Frameshift indels                 31                  31       31
Nonframeshift indels              10                  10       10
Total variants                 2,990               2,903    2,201
Singleton                      1,602               1,598    1,411
Doubleton                        470                 468      378

Numbers shown are after quality-control steps. Annotation performed with GENCODE V14 gene 
definitions. Triallelic (n = 124) and quadrallelic (n = 3) sites (combined SNVs and indels) are shown as 
multiple separate variants with the appropriate annotation for each non-reference allele. 

* MAF in 17,019 sequenced controls. 

† Not seen in dbSNP137, or 1000 Genomes Project (April 2012 release), or NHLBI (data release 
ESP6500SI, with 6,503 individuals). 


array genotypes available for 32,806 of these individuals (Supplemen- 
tary Table 2). We discovered 4,377 variant sites across all amplicons, 
and the genotype call rate was 99.9989% (reference homozygote as well 
as non-reference genotypes) across 41,911 individuals. Of these, 2,990 
variants were in protein-coding regions (including exon splice sites) of 
the 25 genes (Table 1 and Supplementary Table 3); 97.1% of which are 
rare (MAF in 17,019 controls, <0.5%); 73.6% are novel when compared 
with current published datasets (dbSNP137, 1000 Genomes Project, 
National Heart, Lung, and Blood Institute (NHLBI)) containing >6,000 
individuals and 67.3% are novel compared to an unpublished data set 
of 25,994 exome-sequenced individuals (D. G. MacArthur, personal 
communication); and 68.9% were only seen in one (singleton) or two 
(doubleton) individuals. These proportions of novel, and rare, variants 
are similar to recent data from other large re-sequencing studies’. 

Our very high coverage data (99.8% of 183.4 million (site × sample) 
genotype calls had a read depth of ≥40 and 96.6% had a read depth of 
>100; Supplementary Fig. 1) enabled stringent data filtering on call 
rate per sample, per variant site, and other criteria (Methods). To 
confirm data quality, we performed further experiments and analyses 
as follows: (1) we genotyped one control sample 296 times (on differ- 
ent 48-sample microfluidic chips), and the genotype call error rate 
was two non-consensus genotype calls of 1,295,581 called genotypes 
(0.00015%); (2) 32,806 out of 41,911 subjects also had dense 
ImmunoChip genotyping data at the 25 genes, and genotype concord- 
ance at 91 variant sites genotyped on both platforms was 99.994%; (3) 
transition/transversion (Ti/Tv) rates, a quality-control measure based 
on expected human mutation types, were 2.434 at coding-region 
variants (2.427 at singletons), 2.44 at rare (MAF <0.5%) variants 
(2.437 at singletons) and 2.275 at novel variants (2.273 at singletons) 
(definitions in Table 1); (4) we selected all (35) nonsense single nuc- 
leotide variants (SNVs) and all (39) frameshift insertions/deletions 
(indels) in the ImmunoChip-genotyped samples for Sanger sequen- 
cing: two variants failed assay/PCR (polymerase chain reaction) design 
and there was one false-positive SNV and one false-positive indel 
(overall false-positive rate = 2.8%). All 70 validated SNVs and indels 
had the same alleles in high-throughput and Sanger-sequencing 
Figure 1 | Association analyses of discovered rare functional variants in 
autoimmune diseases. We define rare functional variants as MAF < 0.5% in 
17,019 controls and predicted nonsynonymous, premature-stop or splice-site 
annotation. Quantile-quantile plots compare observed versus expected test- 
statistic distributions, with shading indicating 99% confidence intervals. Full 
results are available in Supplementary Data. Each of six individual diseases, and 
all autoimmune diseases combined, were tested as phenotypes. a, Gene-based 
C-alpha test (25 genes by 7 phenotypes, n = 41,911 subjects) allowing for both 
risk and protective effects for rare functional variants. Singleton variants pooled 
into a single binomial count per phenotype. b, Gene-based burden tests (25 
genes by 7 phenotypes, n = 41,911 subjects) comparing summed allele counts 
for rare functional variants in cases versus controls with Fisher’s exact test. 
[Figure 1 panel label: expected −log10(P); 175 tests (25 genes by 7 cohorts); functional variants with MAF < 0.5%] 


c, Conditional gene-based burden test (25 genes by 6 phenotypes, n = 32,806 
subjects): rare functional-variant allele counts are summed for each individual 
per gene and introduced in a logistic regression, including ImmunoChip 
covariates for multiple independent top (common) variant signals selected on 
the basis of a stepwise regression (down to P > 10⁻⁴). The psoriasis phenotype 
was not tested as most samples do not have ImmunoChip data. d, Count of 
case-unique rare alleles (UNIQ) tests (25 genes by 7 phenotypes, n = 41,911 
subjects): compares the number of rare functional variants only observed in 
cases with the distribution of this value upon random permutation (10,000 
times) of the phenotypes. e, Count of control-unique rare alleles (UNIQ) tests: 
same as d but for rare functional variants uniquely observed in controls. 
assays; (5) proportions of rare, and of known, variants were similar to 
those found by other large sequencing studies, and we identified no 
common or low-frequency novel variant sites. 
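
The quality-control figures quoted above are simple ratios, and a Ti/Tv statistic can be computed directly from reference/alternate allele pairs. The Python sketch below illustrates both under stated assumptions: the helper names and the toy SNV list are hypothetical and are not taken from the study's pipeline (which used PLINK/SEQ for Ti/Tv statistics).

```python
# Illustrative sketch only: helper names and toy data are assumptions,
# not part of the study's pipeline.

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs):
    """Return transitions/transversions for a list of (ref, alt) base pairs."""
    ti = sum(1 for pair in snvs if pair in TRANSITIONS)
    tv = len(snvs) - ti
    if tv == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return ti / tv


def percent(numerator, denominator):
    """Express a ratio as a percentage."""
    return 100.0 * numerator / denominator


# Replicate genotyping: 2 non-consensus calls out of 1,295,581 genotypes.
replicate_error_pct = percent(2, 1_295_581)

# Sanger validation: 35 SNVs + 39 indels selected, 2 failed assay design,
# leaving 72 assayed variants, of which 2 were false positives.
assayed = 35 + 39 - 2
false_positive_pct = percent(2, assayed)

# Toy example: four transitions and two transversions.
toy_snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
toy_ratio = ti_tv_ratio(toy_snvs)
```

With the counts reported in the text, the two percentages round to 0.00015% and 2.8%, matching the quoted values.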

We first attempted to identify any low-frequency or rare variants of 
larger effect. For each coding-region variant and each of seven 
phenotypes (including all autoimmune disease cases combined), we performed 
a single-variant association analysis. Only previously reported loci 
were observed with common variants (MAF > 5%), as expected. We 
identified three low-frequency (MAF = 0.5-5%) and rare (MAF in 
17,019 controls = <0.5%) exonic variants with single SNP association 
P < 10⁻⁴ (chosen as a partial Bonferroni multiple testing correction for 
25 genes and 7 phenotypes, but not correcting for all variants per 
gene) (Supplementary Table 4 and Supplementary Data). We next 
analysed low-frequency and rare exonic variants, conditioning on 
common-variant non-coding signals at each locus, and observed no 
additional association signals (Supplementary Data). An association 
between type 1 diabetes and the low-frequency UBASH3A SNP 
rs17114930 was observed, but conditional regression analysis showed 
this signal to be secondary to a stronger common-frequency variant/ 
haplotype previously identified by GWAS”. We identified novel low- 
frequency (nearly ‘common’ as MAF in 17,019 controls = 4.97%) 
NCF2 coding-region variant associations with coeliac disease at two 
SNPs (rs17849502, nonsynonymous; rs17849501, synonymous; in 
almost complete linkage disequilibrium, r² = 0.992). Both variants 
were present on the Illumina ImmunoChip, but just failed quality- 
control criteria in our previous coeliac disease study owing to missing 
data’’. We replicated the UK findings in 4,313 coeliac cases and 3,954 
controls (European samples, Methods; rs17849502 P = 4.46 X 10 ° 
(Cochran–Mantel–Haenszel test), odds ratio 1.35 (95% CI = 1.17–1.55)). 
Logistic regression analysis conditioning on rs17849502 in the UK 
re-sequencing data set revealed no further single-variant coeliac 
disease association signals below P < 10⁻⁴. NCF2 is a component of 
the neutrophil NADPH oxidase respiratory burst complex. Different 
disease-causing mutations cause the recessive Mendelian phenotype 
chronic granulomatous disease. The rs17849502/H389Q variant is 
also associated with the autoimmune disease systemic lupus erythe- 
matosus”'. Functional studies have shown that the minor allele of 
rs17849502/H389Q reduces the binding efficiency of NCF2 to the 
guanine nucleotide-exchange factor VAV1 (ref. 21). These data now 
implicate a disease mechanism of impaired neutrophil function in 
coeliac disease, a condition previously thought to be of predominantly 
B- and T-cell-mediated immunopathogenesis, and where neutrophils 
may have a role in regulating adaptive immunity”. 

We noted that even with ~7,000 cases and ~17,000 controls the 
power to detect association signals using single-variant tests for variants 
(MAF < 0.5%) of modest effect (for example, odds ratio < 3) is limited 
(Supplementary Fig. 2) and therefore we performed gene-based pooled- 
variant association tests to better detect the combined effect of multiple 
variants. We defined coding-region variants as functional candidates if 
the variants were rare (MAF in 17,019 controls = <0.5%) and predicted 
to be of potential functional impact (nonsynonymous, premature stop, 
splice-site altering; see Methods). We pooled variants (by gene) in ana- 
lyses to detect different scenarios (Fig. 1 and Supplementary Data), 
including the C-alpha test, which can detect a combination of risk and 
protective variants; burden tests to detect either an excess of risk variants 
in cases or protective variants in controls; a modified version of the 
burden test using conditional regression and common-variant non- 
coding signals at a locus as covariates; a test to detect an excess of rare 
variants seen uniquely in cases (the case or control unique tests being 
particularly suitable for the study of the large numbers of singleton and 
doubleton variants we observe); and a test to detect an excess of rare 
variants seen uniquely in controls. The distribution of association stat- 
istics for all five pooled gene tests across each of the six or seven pheno- 
types tested was consistent with the global null of no association. 
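
As an illustration of the simplest of these pooled tests, the sketch below implements a gene-level burden comparison with a two-sided Fisher's exact test using only the Python standard library. It is a minimal sketch, not the authors' R code: the function names are hypothetical, and the 2 × 2 table layout (rare alleles versus remaining chromosomes, assuming two chromosomes per individual) is an assumption about how the summed allele counts are tabulated.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def burden_test(rare_in_cases, n_cases, rare_in_controls, n_controls):
    """Compare summed rare-allele counts for one gene in cases versus controls.

    Assumption: the table contrasts rare alleles with the remaining
    chromosomes (2 per individual) in each group.
    """
    a = rare_in_cases
    b = 2 * n_cases - rare_in_cases
    c = rare_in_controls
    d = 2 * n_controls - rare_in_controls
    return fisher_exact_two_sided(a, b, c, d)
```

For example, `burden_test(3, 2, 1, 2)` evaluates the table [[3, 1], [1, 3]]. A real analysis would call this once per gene and phenotype (25 genes by 7 phenotypes in the study).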

On the basis of these results, in the largest (to the best of our know- 
ledge) human disease sample sequencing study to date, we find little 
support for a significant impact of rare coding-region variants in 
known risk genes for the autoimmune disease phenotypes tested. 
Our data provide little stimulus in support of large-scale whole-exome 
sequencing projects in common autoimmune diseases. Using average 
genetic-effect estimates from our data (Methods), over all loci and 
phenotypes we have tested, we estimate that rare variants contribute 
to less than 3% of the heritability explained by common variants at 
these known risk loci”. 


METHODS SUMMARY 


Sequencing. DNA (corresponding to exonic sequence of 25 autoimmune disease 
risk genes) was PCR-amplified in a multiplexed microfluidics assay (Fluidigm 
Access Array). PCR amplicons from a sample were pooled, and barcoded with 
one of 1,536 unique ten-base-pair sequences. Libraries of 1,536 samples were 
sequenced on Illumina HiSeq instruments. Reads were aligned to the GRCh37 
human reference and SNVs and small indels called. Samples and called variants 
were extensively filtered on the basis of call rate and other criteria. Selected variants 
were validated by Sanger dideoxy sequencing. Genotype data from Illumina 
ImmunoChip array-based genotyping was merged with Fluidigm sequencing- 
based genotypes. 

Statistical analysis. Statistical analysis was performed in R, and using PLINK/SEQ 
software. 


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS 


Gene selection. All genes studied (listed in Supplementary Table 3) were risk loci 
for at least two phenotypes, had a known immune system function, were from loci 
with only a single strong candidate immune gene (or all immune genes were 
selected at four loci: IL18R1, IL18RAP; CTLA4, CD28, ICOS; IL2, IL21; PTPRK, 
THEMIS), and all genes and loci were densely genotyped with all 1000 Genomes 
pilot project variants on the Illumina ImmunoChip (for design of this chip, see 
ref. 19). Additional criteria favouring locus selection were: known multiple inde- 
pendent association signals, risk (not necessarily same variants/haplotype or signal 
direction) for many autoimmune diseases, fine-mapping or other data strongly 
suggesting a single candidate gene, and smaller complementary DNA size. 
Samples. UK samples for the six component immune disease phenotypes have 
been described in previous publications (which also contain full details of Ethics 
Committee approvals)'”’°**”’, as have the three control populations'””*. Informed 
consent was obtained from all subjects. Individuals with self-reported autoimmune 
disease were excluded from the UK Blood Services — Common Controls and NIHR 
Cambridge Biomedical Research Centre Cambridge BioResource controls. Samples 
with self-stated non-white European ethnicity were excluded (later further con- 
firmed by ImmunoChip-based principal component ethnicity analysis for 32,806 
samples). Samples with gross discordance with ImmunoChip genotypes and/or 
with known gender or genotype-mismatch issues from previous GWAS were 
excluded. Samples with known duplicates or relatedness (as distant as first cousins) 
were excluded; relatedness was later confirmed by ImmunoChip genome-wide 
identity-by-state analysis and by analysis of multiple rare-variant sharing in 
Fluidigm sequencing data. Additional independent European samples genotyped 
for rs17849502 (4,313 coeliac cases and 3,954 controls) were previously described’’. 
Wet-lab. PCR primers were designed for all RefSeq exons of 26 genes, and ampli- 
cons selected to be 150-200 base pairs (bp) in size. There was minor primer design 
dropout at IL18R1, STAT4, THEMIS and ZMIZ1, although >94% of exon sequence 
was still covered at these genes. Variant calls at the gene YDJC later proved unre- 
liable with highly biased allele depths at heterozygote sites, probably due to the very 
high exon GC content (~70%), and this gene was not further analysed nor is it 
discussed elsewhere in this study. The total length of (overlapping) amplicons was 
95,927 bp; with primers removed (still overlapping) 72,612 bp; and with primers 
removed and unique sequence 58,550 bp. PCR amplification was performed using 
50 ng genomic DNA per sample on the 48 sample/plate Fluidigm microfluidic 
Access Array system. PCR primers for 511 PCR reactions were pooled up to 12- 
plex per well in 48 pools. Individual per sample per pool PCR reactions took place in 
~35-nl reaction chambers with ~300 DNA haplocopies per reaction. All pools per 
sample were combined. Each sample’s pool was then individually barcoded in a 
second PCR reaction with one of 1,536 10-bp Fluidigm-designed unique barcodes 
(Fluidigm unidirectional sequencing protocol). 

Sequencing. Thirty-four libraries (each of 1,536 barcoded samples) were gener- 
ated. Libraries were first sequenced on an Illumina MiSeq for rapid quality control 
of the barcoding step, and to optimize loading concentrations/cluster density. 
Libraries were then sequenced one per lane using 101-bp paired-end reads and 
an 11-bp index read (the last base of each read being only used for chemistry cycle 
phasing purposes) on Illumina HiSeq sequencers. Lanes were repeated if target 
cluster density or target clusters passing filter were not achieved. Individual sam- 
ples were de-multiplexed by Illumina CASAVA software, allowing zero mis- 
matches per 10-bp barcode. Sanger sequencing was performed on PCR 
products using an ABI 3730xl DNA analyser and ABI big dye terminator 3.1 cycle 
chemistry. We sequenced all samples with rare-variant allele genotypes, and a 
control sample, for the 74 sites selected. 

Bioinformatics. PCR primers were trimmed from the 5’ end of individual reads 
using a modified version of btrim (ref. 29). Trimmed sequences were aligned to the 
GRCh37 human reference genome using gapped quality-aware alignment, and 
base call quality recalibration implemented in Novoalign V2.07.18 with settings ‘-t 
100 -H -g 65 -x 7 -o FullNW’. Data were realigned against known (1000 Genomes 
and Mills-Devine 2-hit) indels and per-sample called indels. SNPs were called 
using GATK 1.6-5 and settings ‘-min_base_quality_score 15 -stand_call_conf 30 
-baq CALCULATE_AS_NECESSARY -glm SNP -baqGapOpenPenalty 65 
-downsampling_type BY_SAMPLE -downsample_to_coverage 250’ and then hard 
filtered using GATK settings ‘QUAL<80.0 DP<20 MQ<40.0 QD<2.0 
MQRankSum<-12.5 HRun>5’ (several other recommended best practice 
GATK settings were not appropriate for PCR amplicon data), and around indels. 
Small indels (up to 15-bp gaps from Novoalign) were called using GATK and 
settings ‘-min_base_quality_score 15 -stand_call_conf 30 
-baq CALCULATE_AS_NECESSARY -glm INDEL -baqGapOpenPenalty 65 
-downsampling_type BY_SAMPLE -downsample_to_coverage 250’ and then hard 
filtered using GATK settings ‘QUAL<80.0 DP<20 QD<2.0’ (several other recommended 
best-practice GATK settings were not appropriate for PCR amplicon data). The 
most important of these settings were likely to be calling genotypes as missing with 
sequencing depth <20 high-quality bases and the minimum Phred 15 recalibrated 
base call quality score to define high-quality bases. Both SAMtools and VCFtools 
software were also used to process data. SNP genotypes (including non-reference 
genotypes) were called at all 58,550 bases of amplicon sequence. Samples with 
<57,600 SNP genotype calls (98.4%, a threshold determined by inspection of the 
call rate plot) were removed and scheduled for repeat processing. Clusters of very 
close non-reference genotypes in an individual sample were removed. Non- 
reference genotype sites were then identified across all samples, and VCF-level 
data reduced to variants at polymorphic sites (in one or more samples). A com- 
bined VCF file of all polymorphic sites and samples was then loaded into PLINK/ 
SEQ v0.09. Multiple-step filtering based on call rate per sample and call rate per 
variant site was applied, with final requirements >99.95% call rate per sample and 
per variant site. Lower call rate samples at this stage were also scheduled for repeat 
processing. We removed variants if the sum of heterozygote genotype allele depths 
was <25% or >75%. The final filtered data was then exported to a VCF file 
containing all variants and samples for analysis in R. ImmunoChip data was 
loaded into Illumina GenomeStudio software from .idat files, and all samples 
called together in GenomeStudio using the cluster settings as previously 
described’. Data were merged with HapMap Phase 3 genotypes, principal com- 
ponent analysis performed, and the first two principal components used to val- 
idate ethnicity (Supplementary Fig. 3). 
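
Two of the filters described above lend themselves to a short illustration: the per-sample call-count threshold and the heterozygote allele-balance filter. The Python sketch below is a hedged reading of the text, not the authors' code. All function names are hypothetical, and interpreting "sum of heterozygote genotype allele depths" as the alternate-allele share of depth summed over heterozygous genotypes is an assumption.

```python
# Illustrative sketch only; names and the allele-balance interpretation
# are assumptions, not the study's implementation.

AMPLICON_BASES = 58_550   # unique amplicon bases with SNP genotype calls
MIN_SNP_CALLS = 57_600    # samples below this count were removed


def sample_call_rate(n_called, n_sites=AMPLICON_BASES):
    """Fraction of amplicon bases with a called SNP genotype."""
    return n_called / n_sites


def sample_passes(n_called, min_calls=MIN_SNP_CALLS):
    """Keep samples with at least the minimum number of genotype calls."""
    return n_called >= min_calls


def het_alt_fraction(het_depths):
    """Alternate-allele share of depth, summed over heterozygous genotypes.

    het_depths is a list of (ref_depth, alt_depth) tuples, one per
    heterozygous genotype at the variant site.
    """
    ref_total = sum(ref for ref, _ in het_depths)
    alt_total = sum(alt for _, alt in het_depths)
    return alt_total / (ref_total + alt_total)


def variant_passes_allele_balance(het_depths, low=0.25, high=0.75):
    """Remove variants whose summed heterozygote allele share is <25% or >75%."""
    if not het_depths:
        return True  # assumption: nothing to test without heterozygotes
    return low <= het_alt_fraction(het_depths) <= high
```

The threshold of 57,600 calls out of 58,550 bases corresponds to the 98.4% call rate quoted in the text.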

Barcode and sequencing amplicon performance. Barcode evenness was excel- 
lent, with typically 99.0% of the 1,536 barcodes producing pass-filter read numbers 
that were between 0.033% and 0.13% of the total pass-filter reads per lane (0.065% 
expected), with most of the failing barcodes tagging known water-negative control 
samples or (based on repeat amplification with a different barcode) due to poor 
DNA quality. Amplicon evenness was good, and for many genotype calls we 
were required to downsample data to 250 bases per site per sample 
(Supplementary Fig. 1). However, 10 of 511 amplicons effectively failed PCR. In 
a typical analysis of 100 high-quality samples, 2% of the 58,550 unique amplicon 
bases had a minimum mean read-depth of <20, nearly all accounted for by the 10 
failing amplicons. 
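
The expected per-barcode share of reads follows directly from the number of barcodes (1/1,536, about 0.065%), and the reported acceptance window of 0.033% to 0.13% is roughly half to twice that share. The sketch below shows how such an evenness summary might be computed; the function name and the half/double factors are assumptions made for illustration.

```python
# Illustrative sketch; the 0.5x / 2x window is an assumed reading of the
# reported 0.033%-0.13% range around the expected share.

N_BARCODES = 1536
expected_share_pct = 100.0 / N_BARCODES  # about 0.065% of pass-filter reads


def barcode_evenness(read_counts, low_factor=0.5, high_factor=2.0):
    """Fraction of barcodes whose read share lies within the window."""
    total = sum(read_counts)
    expected = 1.0 / len(read_counts)
    in_range = sum(
        1
        for count in read_counts
        if low_factor * expected <= count / total <= high_factor * expected
    )
    return in_range / len(read_counts)
```

On a toy lane of four barcodes with read counts `[100, 120, 90, 10]`, three barcodes fall inside the window and the evenness is 0.75.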

Variant annotation. Annotation of all variants was first performed using 
ANNOVAR (Feb 2013) and the GENCODE V14 data set. Coding variants were 
identified. Rare functional variants were identified based on stop, frameshift indel, 
nonsynonymous (SNV or 3n indel) or splice predictions. We performed an additional 
layer of annotation for high confidence loss of function mutations, using the 
methods described in ref. 30. The Variant Effect Predictor (VEP v2.5) tool from 
Ensembl was modified to produce custom annotation tags and additional loss of 
function (LOF) annotations. The additional LOF annotation was applied to variants 
which were annotated as STOP_GAINED, SPLICE_DONOR_VARIANT, 
SPLICE_ACCEPTOR_VARIANT, and FRAME_SHIFT and flagged if any filters 
failed. Filters included: LOF is the ancestral allele; exon is surrounded by non- 
canonical splice site (that is not AG/GT); LOF removes less than 5% of remaining 
protein; LOF is rescued by nearby start codon which results in less than 5% of 
protein truncated; transcript only has one coding exon; splice-site mutation within 
intron smaller than 15 bp; splice site is non-canonical OR other splice site within 
same intron is non-canonical; unable to determine exon/intron boundaries sur- 
rounding variant. A LOF variant is predicted as high confidence if there is one 
transcript that passes all filters, otherwise it is predicted as low confidence. We 
noted that LOF mutations were seen in 21 out of 25 genes, all were heterozygous 
genotypes, and mainly (87 out of 97) as singletons or doubletons in the 41,911 
samples (Supplementary Table 3). 
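
The confidence rule described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the modified VEP code: the data layout (a mapping from transcript identifier to the list of failed filter names) and the function name are hypothetical, and "one transcript" is read as "at least one transcript".

```python
# Illustrative sketch; the input layout and names are hypothetical.

def lof_confidence(failed_filters_by_transcript):
    """Classify a LOF variant as 'high' or 'low' confidence.

    failed_filters_by_transcript maps a transcript ID to the list of
    filter names the variant failed on that transcript. The variant is
    high confidence if at least one transcript passes every filter.
    """
    passes_somewhere = any(
        len(failed) == 0 for failed in failed_filters_by_transcript.values()
    )
    return "high" if passes_somewhere else "low"


example = {
    "transcript_1": ["single_coding_exon"],
    "transcript_2": [],
}
example_call = lof_confidence(example)
```

In the example, the variant fails a filter on one transcript but passes all filters on the other, so it is classed as high confidence.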

Statistical analysis. Most analysis was performed in R using custom code (avail- 
able on request). For tests using permutations (C-alpha, UNIQ-cases and UNIQ- 
controls in Fig. 1), we randomly permuted in R the case-control status 10,000 
times. The unconditional burden test (Fig. 1b) used a Fisher’s exact test. 
Conditional burden tests used the glm function in R, including selected 
ImmunoChip common variants as covariates (selection based on a stepwise 
regression analysis up to 10⁻⁴). For the C-alpha statistic computation (Fig. 1a), 
the expected proportion of rare alleles in the case-control cohorts was set to the 
proportion of cases and controls. Figure 1 was generated using the fact that under 
the null of no association −2 log(P) is distributed as chi-squared with 2 degrees of 
freedom. PLINK/SEQ v0.09 (http://atgu.mgh.harvard.edu/plinkseq/index.shtml) 
was used for Ti/Tv statistics, and to confirm findings of R analyses (not shown). 
We used PLINK/SEQ for the genotype concordance analysis between Immuno- 
Chip and Fluidigm-sequencing data. Discordant calls were observed at 169 of 
2,985,255 (0.0056%) genotypes, occurring at 36 out of 91 polymorphic variant 
sites present in both data sets. We inspected Illumina ImmunoChip R theta 
intensity plots for the discordant genotypes, and observed 8 discordant genotypes 
to be likely due to ImmunoChip data mis-clustering, and 11 discordant genotypes 
to be due to a third or fourth observed allele in the high-throughput sequencing data. 
At the sites with third and fourth alleles, we note the ImmunoChip array assays can 
only call two alleles, therefore it is not possible to determine whether these sequence 
genotype calls are real or errors. R code used for analysis is available from V.P. 
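
The permutation-based UNIQ-cases test described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the genotype layout, the function names, the fixed random seed and the add-one correction to the empirical P value are all assumptions.

```python
import random


def count_case_unique(genotypes, is_case):
    """Count variants carried by at least one case and by no control.

    genotypes is a list of variants; each variant is a list of
    per-individual non-reference allele counts, in the same order as is_case.
    """
    n_unique = 0
    for variant in genotypes:
        in_cases = any(g > 0 for g, case in zip(variant, is_case) if case)
        in_controls = any(g > 0 for g, case in zip(variant, is_case) if not case)
        if in_cases and not in_controls:
            n_unique += 1
    return n_unique


def uniq_permutation_p(genotypes, is_case, n_perm=10_000, seed=0):
    """Empirical P value for an excess of case-unique rare variants."""
    rng = random.Random(seed)
    observed = count_case_unique(genotypes, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute case-control status
        if count_case_unique(genotypes, labels) >= observed:
            hits += 1
    # Add-one correction (an assumption) avoids reporting P = 0.
    return observed, (hits + 1) / (n_perm + 1)
```

The UNIQ-controls test is the mirror image: swap the roles of cases and controls in `count_case_unique`.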

Estimation of average genetic effect contributed by rare variants. For each 
combination of locus by disease, we combined all rare functional variants (frequency 
< 0.5% in 1000 Genomes/NHLBI data sets and nonsynonymous, LOF or splicing) 
in a burden statistic X and computed the combined frequency of X in the sample. 
Using a logistic regression model with the disease phenotype as outcome, we esti- 
mated the odds ratio associated with the burden variable X. This knowledge of 
frequency and odds ratio for the burden variable X enables the estimation of the 
average genetic effect (AGE, as defined in ref. 23) version of the variance explained. 
We then compared this variance at each combination of locus/gene with the variance 
explained by what we consider to be a typical common variant association (odds 
ratio 1.2, MAF 20%, assuming a single common variant per locus). To deal with the 
uncertainty in estimated odds ratio and obtain a confidence interval for this value, we 
randomly sampled the odds ratio from their estimated distribution for each pair of 
disease/locus. Averaging over the 150 combinations of 6 diseases by 25 loci, we 
estimate the ratio of heritability explained for all rare variants by all common variants 
to have a mean value of 1.6%, with a confidence interval of (1.2-2.3%). It is pointed 
out in ref. 23 that the AGE estimate can underestimate the true explained variance by 
rare variants. Nevertheless, assuming that rare variants are generally all risk or all 
protective at a given gene, their simulations also show that the underestimation is 
limited, in the range of a 25% decrease. Taking this conservative estimate of the 
under-estimation level, we find the upper bound of the 95% confidence 
interval to be 3.05%. Hence, our data indicate that the aggregate contribution of rare 
variants to the heritability (<0.5% MAF, and averaged over these loci/diseases) is 
unlikely to exceed approximately 3% of the heritability assigned to common variants. 
We acknowledge that a much larger underestimation (and therefore a much larger 
heritability explained for rare variants) is possible in the presence of a combination of 
high risk and highly protective rare variants at the same locus. Although we cannot 
exclude such a scenario, it is unlikely to be widespread. We also assumed in our 
estimates that rare variants act additively at the log scale. Although this assumption 
is standard, we cannot exclude that a combination of rare variants results in a much 
stronger predictive outcome than rare variants individually, hence underestimating 
the heritability associated with rare variants. 
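
To make the comparison above concrete, the sketch below uses a simple additive approximation on the log-odds scale, 2p(1 − p)(ln OR)², for the variance contributed by a variant or burden variable. This is explicitly not the AGE estimator of ref. 23, whose exact form is not given here; it is a commonly used stand-in chosen for illustration, and the function names are hypothetical. The final helper assumes the 25% underestimation is corrected by dividing by 0.75.

```python
from math import log


def log_odds_variance(freq, odds_ratio):
    """Approximate variance of an additive predictor on the log-odds scale.

    Uses 2p(1 - p)(ln OR)^2, an assumed stand-in for the AGE estimate.
    """
    return 2.0 * freq * (1.0 - freq) * log(odds_ratio) ** 2


# 'Typical' common-variant association used as the reference in the text.
common_reference = log_odds_variance(0.20, 1.2)


def rare_to_common_ratio(burden_freq, burden_odds_ratio):
    """Variance from a rare-variant burden relative to the common reference."""
    return log_odds_variance(burden_freq, burden_odds_ratio) / common_reference


def correct_for_underestimation(ratio_pct, underestimation=0.25):
    """Inflate an estimate assumed to be biased downward by a given fraction."""
    return ratio_pct / (1.0 - underestimation)


upper_bound_pct = correct_for_underestimation(2.3)
```

With the rounded upper confidence limit of 2.3%, this correction gives about 3.07%, close to the reported 3.05%; the small difference is presumably due to rounding of the inputs.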


24. Cooper, J. D. et al. Seven newly identified loci for autoimmune thyroid disease. 
Hum Mol Genet 21, 5202-5208 (2012). 

25. Jostins, L. et al. Host-microbe interactions have shaped the genetic architecture 
of inflammatory bowel disease. Nature 491, 119-124 (2012). 

26. Sawcer, S. et al. Genetic risk and a primary role for cell-mediated immune 
mechanisms in multiple sclerosis. Nature 476, 214-219 (2011). 

27. Tsoi, L. C. et al. Identification of 15 new psoriasis susceptibility loci highlights the 
role of innate immunity. Nat Genet 44, 1341-1348 (2012). 

28. Dendrou, C. A. et al. Cell-specific protein phenotypes for the autoimmune locus 
IL2RA using a genotype-selectable human bioresource. Nat Genet 41, 
1011-1015 (2009). 

29. Kong, Y. Btrim: a fast, lightweight adapter and quality trimming program for next- 
generation sequencing technologies. Genomics 98, 152-153 (2011). 

30. MacArthur, D. G. et al. A systematic survey of loss-of-function variants in human 
protein-coding genes. Science 335, 823-828 (2012). 


©2013 Macmillan Publishers Limited. All rights reserved 




doi:10.1038/nature12172 


Single-cell transcriptomics reveals bimodality in 
expression and splicing in immune cells 


Alex K. Shalek'*, Rahul Satija’*, Xian Adiconis?, Rona S. Gertner', Jellert T. Gaublomme’, Raktima Raychowdhury’, 
Schraga Schwartz’, Nir Yosef’, Christine Malboeuf’, Diana Lu’, John J. Trombetta’, Dave Gennert?, Andreas Gnirke’, 
Alon Goren”, Nir Hacohen**, Joshua Z. Levin’, Hongkun Park!? & Aviv Regev”> 


Recent molecular studies have shown that, even when derived from 
a seemingly homogenous population, individual cells can exhibit 
substantial differences in gene expression, protein levels and 
phenotypic output’, with important functional consequences*”. 
Existing studies of cellular heterogeneity, however, have typically 
measured only a few pre-selected RNAs’” or proteins” simulta- 
neously, because genomic profiling methods’ could not be applied 
to single cells until very recently” '°. Here we use single-cell RNA 
sequencing to investigate heterogeneity in the response of mouse 
bone-marrow-derived dendritic cells (BMDCs) to lipopolysacchar- 
ide. We find extensive, and previously unobserved, bimodal vari- 
ation in messenger RNA abundance and splicing patterns, which 
we validate by RNA-fluorescence in situ hybridization for select 
transcripts. In particular, hundreds of key immune genes are 
bimodally expressed across cells, surprisingly even for genes that 
are very highly expressed at the population average. Moreover, 
splicing patterns demonstrate previously unobserved levels of het- 
erogeneity between cells. Some of the observed bimodality can be 
attributed to closely related, yet distinct, known maturity states of 
BMDCs; other portions reflect differences in the usage of key regu- 
latory circuits. For example, we identify a module of 137 highly 
variable, yet co-regulated, antiviral response genes. Using cells 
from knockout mice, we show that variability in this module 
may be propagated through an interferon feedback circuit, invol- 
ving the transcriptional regulators Stat2 and Irf7. Our study 
demonstrates the power and promise of single-cell genomics in 
uncovering functional diversity between cells and in deciphering 
cell states and circuits. 

To characterize the extent of expression variability on a genomic 
scale and decipher its functional implications, we used single-cell RNA 
sequencing (RNA-Seq) to profile a temporal snapshot of the BMDC 
response to lipopolysaccharide (LPS). This is an attractive model sys- 
tem for single-cell analyses for several reasons. First, LPS, a component 
of Gram-negative bacteria and a ligand of Toll-like receptor 4, strongly 
synchronizes cellular responses and mitigates temporal phasing". 
Second, LPS activation evokes a robust transcriptional program that 
has been extensively investigated at the population level'*. Third, LPS 
stimulation should increase the correlation between mRNA and protein 
levels for induced genes, thus reducing a potentially confounding factor’’. 
Finally, differentiated BMDCs are post-mitotic, largely removing cell 
cycle-dependent transcriptional variation’. 

We stimulated BMDCs with LPS and collected single cells after 
four hours’ (Supplementary Information). Using SMART-Seq’, we 
constructed complementary DNA libraries from 18 single BMDCs 
(S1-S18), three replicate populations of 10,000 cells, and two negative 
controls (empty wells), and sequenced each to an average depth of 27 mil- 
lion read pairs. Negative control libraries failed to align (<0.25%) to the 


mouse genome, and were discarded from all further analyses. Library 
quality metrics, such as genomic alignment rates, ribosomal RNA con- 
tamination, and 3’ or 5’ coverage bias, were similar across all libraries 
(Supplementary Table 1). We estimated expression levels for all Univer- 
sity of California Santa Cruz (UCSC)-annotated genes using RSEM 
(RNA-Seq by expectation maximization)'* (Supplementary Table 2) 
and discarded genes that were not appreciably expressed (transcripts 
per million (TPM) > 1) in at least three individual cells, retaining 
6,313 genes for further analysis. 
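
The expression filter described above (retain a gene only if it exceeds 1 TPM in at least three individual cells) can be sketched in Python as follows. The data layout and function name are hypothetical; this is an illustration of the stated rule, not the authors' code.

```python
# Illustrative sketch; the dictionary layout and names are assumptions.

def filter_expressed_genes(tpm_by_gene, min_tpm=1.0, min_cells=3):
    """Keep genes with TPM > min_tpm in at least min_cells single cells.

    tpm_by_gene maps a gene name to a list of per-cell TPM values.
    """
    return {
        gene: values
        for gene, values in tpm_by_gene.items()
        if sum(1 for v in values if v > min_tpm) >= min_cells
    }


toy = {
    "GeneA": [0.0, 5.2, 3.1, 12.0],   # expressed in three cells: kept
    "GeneB": [0.0, 0.4, 250.0, 0.0],  # expressed in one cell: dropped
}
kept = filter_expressed_genes(toy)
```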

Although the gene expression levels of population replicates were 
tightly correlated with one another (Pearson r > 0.98, log-scale; 
Fig. 1a), there were substantial differences in expression between indi- 
vidual cells (0.29 < r < 0.62, mean: 0.48; Fig. 1b and Supplementary 
Figure 1 | Single-cell RNA-Seq of LPS-stimulated BMDCs reveals extensive 
transcriptome heterogeneity. a–c, Correlations of transcript expression levels 
(x and y-axes: log-scale TPM + 1) between two 10,000-cell population 
replicates (rep.) (a), two single cells (S1 and S2) (b), and the ‘average’ single cell 
and a population (c). d, e, RNA-Seq read densities in single cells (blue) and 
population replicates (grey) for three non-variable genes (d) and four variable 
ones (e). f, g, RNA-FISH of representative transcripts. Optical micrographs 
(cell boundaries; grey outlines) and maximum-normalized distributions of 
expression levels from a RNA-FISH co-staining (n = 3,193 cells) for Il6 
(yellow) and Cxcl1 (magenta). Scale bars, 25 μm. 
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Fig. 1). Despite this extensive cell-to-cell variation, expression levels for 
an ‘average’ single cell correlated well with the population samples 
(0.79 < r < 0.81; Fig. 1c and Supplementary Fig. 1). 
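
The three comparisons above (replicate versus replicate, cell versus cell, and ‘average’ cell versus population) all reduce to a Pearson correlation on log-transformed TPM values. The sketch below assumes a log(TPM + 1) transform and assumes that the ‘average’ cell is the gene-wise mean TPM across cells; the function names and toy values are illustrative only.

```python
import math


def log_tpm(values):
    """Transform TPM values to log(TPM + 1); the log base does not affect r."""
    return [math.log(v + 1.0) for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_cell(cells):
    """Gene-wise mean TPM across cells (assumed definition of the 'average' cell)."""
    return [sum(gene_values) / len(cells) for gene_values in zip(*cells)]


# Made-up TPM values: rows are hypothetical cells, columns are genes.
toy_cells = [
    [100.0, 0.0, 50.0, 10.0, 900.0],
    [0.0, 200.0, 40.0, 12.0, 1100.0],
    [300.0, 100.0, 60.0, 8.0, 1000.0],
]
toy_population = [130.0, 100.0, 50.0, 10.0, 1000.0]

r_cell_vs_cell = pearson_r(log_tpm(toy_cells[0]), log_tpm(toy_cells[1]))
r_avg_vs_pop = pearson_r(log_tpm(average_cell(toy_cells)), log_tpm(toy_population))
print(f"cell vs cell: r = {r_cell_vs_cell:.2f}")
print(f"average cell vs population: r = {r_avg_vs_pop:.2f}")
```

In this toy example, as in the measurements reported here, individual cells disagree with each other far more than their average disagrees with the population.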

We used RNA-fluorescence in situ hybridization (RNA-FISH), an 
amplification-free imaging technique’, to verify that heterogeneity in 
our single-cell expression data reflected true biological differences, 
rather than technical noise associated with the amplification of small 
amounts of cellular RNA. For 25 genes, selected to cover a wide range 
of expression levels, the variation in gene expression detected by RNA- 
FISH closely mirrored the heterogeneity observed in our sequencing 
data (Fig. 1d-g and Supplementary Fig. 2). For example, expression of 
housekeeping genes (such as β-actin (Actb) and β2-microglobulin
(B2m)) matched a log-normal distribution in both single-cell RNA- 
Seq and RNA-FISH measurements, consistent with previous studies’. 
By contrast, many genes involved in the LPS response, although highly 
expressed on average, showed significantly greater levels of heterogen- 
eity, with expression levels deviating ~1,000-fold between individual 
cells in extreme cases (Fig. 1e–g).

More generally, we observed that single-cell variability existed across 
a wide range of population expression levels (Fig. 2a). Of the 522 most 
highly expressed genes (single-cell average TPM > 250; Fig. 2a, 
unshaded region, and Supplementary Table 3), 281 had low cell-to-cell 
variability (coefficient of variation (CV, σ/μ) < 0.25; Supplementary
Information) and were well described by log-normal distributions 
(RNA-Seq: Fig. 2b, c, top, RNA-FISH (Actb, B2m): Supplementary 
Fig. 2). These 281 genes were enriched for housekeeping genes, encod- 
ing ribosomal and other structural proteins (Supplementary Tables 2 
and 3; Bonferroni-corrected P = 1.5 X 10 °), consistent with previous 
findings in yeast’? and mammalian cells’. 
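
The split of highly expressed genes into low- and high-variability sets can be sketched as follows. The cut-offs (single-cell average TPM > 250, CV < 0.25) come from the text; computing μ and σ on log(TPM + 1) values with a population standard deviation is an assumption made for this illustration (the exact definition is given in the Supplementary Information), and the function names are hypothetical.

```python
import math


def cv_on_log_scale(tpm_values):
    """Coefficient of variation (sigma / mu) of log(TPM + 1) across cells."""
    logs = [math.log(v + 1.0) for v in tpm_values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return sigma / mu if mu > 0 else float("inf")


def classify_gene(tpm_values, min_mean_tpm=250.0, cv_cutoff=0.25):
    """Label a gene by its average expression and its cell-to-cell variability."""
    mean_tpm = sum(tpm_values) / len(tpm_values)
    if mean_tpm <= min_mean_tpm:
        return "not highly expressed"
    if cv_on_log_scale(tpm_values) < cv_cutoff:
        return "low variability"
    return "high variability"


# Made-up TPM profiles across five hypothetical cells.
print(classify_gene([900.0, 1000.0, 1100.0, 950.0, 1050.0]))  # housekeeping-like
print(classify_gene([2000.0, 1800.0, 0.0, 2200.0, 1.0]))      # on in some cells only
print(classify_gene([1.0, 2.0, 0.0, 3.0, 1.0]))               # weakly expressed
```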

Notably, however, 185 of the remaining 241 (CV > 0.25; Sup- 
plementary Information) highly expressed genes had bimodal expres- 
sion patterns (Fig. 2b, c, bottom): mRNA levels for these genes were 
high in many of the cells, but were at least an order of magnitude lower 
(often very low or undetectable) than the single-cell average in three or 
more cells. We independently verified this disparity by RNA-FISH (for 
example, Cxcl1, Cxcl10 and Ifit1; Fig. 1f, g and Supplementary Fig. 2),
confirming that it was not a result of technical noise. This variable set 
included both antiviral and inflammatory response genes, and was 
highly enriched for genes in which expression was increased by at 
least twofold after LPS stimulation at the population level'® (P= 
2.7X 1077; hypergeometric test; Supplementary Table 2). Still, bimo- 
dal expression was not a universal feature of immune response trans- 
cripts; some key chemokines and chemokine receptors (Ccl3, Ccl4 and 




Figure 2 | Bimodal variation in expression levels across single cells.
a, Relationship between average expression level in single cells (μ, x axis) and
standard deviation (σ, y axis) for 6,313 genes (Supplementary Table 2). Blue
dashed line denotes maximum theoretical σ for an average expression level
(Supplementary Information). Grey dashed line denotes the constant
coefficient of variation (CV, σ/μ = 0.25). Magenta represents immune response
genes; green denotes housekeeping genes; light blue shaded region represents
single-cell average TPM < 250. b, Cellular heterogeneity for the 522 most

Ccrl2), cytokines (Cxcl2), and signalling molecules (Tank) were highly
expressed in every cell (Supplementary Fig. 3), indicating that all cells 
were indeed activated by LPS. 
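
The bimodality criterion used above (expression at least an order of magnitude below the single-cell average in three or more cells) can be written as a simple predicate. This is a sketch under the assumption that the comparison is made on linear TPM values; the function name and the toy profiles are hypothetical.

```python
def is_bimodal(tpm_values, fold=10.0, min_low_cells=3):
    """True if at least min_low_cells cells lie `fold`-fold below the mean TPM."""
    mean_tpm = sum(tpm_values) / len(tpm_values)
    n_low = sum(1 for v in tpm_values if v * fold <= mean_tpm)
    return n_low >= min_low_cells


# Made-up TPM profiles across eight hypothetical cells.
switch_like = [2000.0, 1800.0, 0.0, 2200.0, 1.0, 5.0, 1900.0, 2100.0]
uniform = [1000.0, 950.0, 1100.0, 1020.0, 980.0, 1050.0, 990.0, 1010.0]
print(is_bimodal(switch_like), is_bimodal(uniform))
```

A gene such as `switch_like` has a high average because most cells express it strongly, yet three cells are essentially off, which is the pattern described for many LPS-response genes.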

This degree of variation in expression for highly expressed (on 
average) transcripts has not been observed in previous reports’’°. 
For example, examination of published single-cell RNA-Seq data sets 
of human embryonic stem cells’ (Fig. 2a), mouse embryonic stem cells, 
and terminally differentiated fibroblasts’? (Supplementary Fig. 4) 
revealed far less heterogeneity in expression for highly abundant 
(population average) genes. Similarly, studies of protein expression 
in mid-log yeast cells and dividing human cell lines’*"” did not find 
such bimodality in (on average) highly expressed genes. We thus pro- 
posed that widespread variability in single-cell gene expression may 
reflect functionally important differences in the stimulated BMDC 
population. 

Furthermore, we found that splicing patterns also showed previ- 
ously unobserved levels of heterogeneity across single cells. Specifi- 
cally, for genes that have multiple splice isoforms at the population 
level, individual cells predominantly expressed one particular isoform. 
We calculated the frequency (percentage spliced in (PSI)) of previously 
annotated splicing events in each of our samples using MISO", a 
Bayesian framework for calculating isoform ratios (Supplementary 
Table 4). Although the population-derived estimates were highly 
reproducible, single cells exhibited significant variability in their 
exon-inclusion frequencies (Fig. 3a, b). 
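
For orientation, the quantity being estimated can be illustrated with a naive read-count ratio. This is explicitly not the MISO model, which is Bayesian and accounts for read assignment uncertainty; the function `naive_psi`, the minimum-coverage cut-off and all read counts below are assumptions made for illustration.

```python
def naive_psi(inclusion_reads, exclusion_reads, min_reads=10):
    """Fraction of informative reads supporting exon inclusion.

    Returns None when coverage is too low for a meaningful estimate.
    """
    total = inclusion_reads + exclusion_reads
    if total < min_reads:
        return None
    return inclusion_reads / total


# Made-up (inclusion, exclusion) read counts for one exon.
population = (480, 520)
single_cells = [(95, 5), (3, 97), (90, 10), (5, 95), (2, 1)]

print("population PSI:", naive_psi(*population))
cell_psis = [naive_psi(inc, exc) for inc, exc in single_cells]
print("single-cell PSI:", cell_psis)

# Cells skewed towards one isoform can still average to an intermediate value.
covered = [p for p in cell_psis if p is not None]
print("mean of covered cells:", sum(covered) / len(covered))
```

The toy numbers show how an intermediate population-level PSI is compatible with individual cells that each express predominantly one isoform.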

We considered the possibility that PCR amplification (intrinsic to 
the library preparation process) could potentially produce an over- 
estimation of isoform regulation variability, particularly for weakly 
expressed transcripts'”. However, even when we limited our analysis 
to 89 alternatively spliced exons (0.2 < population PSI < 0.8) that were 
very highly expressed within a single cell (single cell TPM > 250; 
Supplementary Information), we still observed the same variability 
in splicing patterns among individual cells, with highly skewed 
expression towards a single splice variant (Fig. 3b). We obtained sim- 
ilar results when we generated three additional single-cell cDNA lib- 
raries using a slightly modified SMART-Seq protocol (Supplementary 
Information) in which a four-nucleotide barcode was introduced onto 
each RNA molecule during reverse transcription’, enabling us to 
estimate the number of unique RNA transcripts that existed before 
PCR (Supplementary Figs 5 and 6 and Supplementary Information). 
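
The idea behind the molecular barcodes can be sketched as follows: reads that share a gene and a barcode are treated as PCR copies of one original molecule, so counting distinct barcodes estimates the pre-amplification transcript number. The data layout and function name are assumptions; the actual procedure is described in the Supplementary Information.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct barcodes per gene; PCR duplicates share a barcode."""
    barcodes_by_gene = defaultdict(set)
    for gene, barcode in reads:
        barcodes_by_gene[gene].add(barcode)
    return {gene: len(barcodes) for gene, barcodes in barcodes_by_gene.items()}


# Made-up aligned reads as (gene, four-nucleotide barcode) pairs.
toy_reads = [
    ("GeneA", "ACGT"), ("GeneA", "ACGT"), ("GeneA", "ACGT"),  # one molecule, amplified
    ("GeneA", "TTAG"),
    ("GeneB", "GGCA"), ("GeneB", "CATG"), ("GeneB", "CATG"),
]
molecule_counts = count_unique_molecules(toy_reads)
print(molecule_counts)
print("distinct four-nucleotide barcodes available:", 4 ** 4)
```

Because a four-nucleotide barcode offers only 256 distinct sequences, a count of this kind is an estimate and would be expected to saturate for very abundant transcripts.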

To the best of our knowledge, single-cell variation in splicing pat- 
terns has rarely been studied for individual genes, and never been 
analysed on a genomic scale. One recent report’? used RNA-FISH to

highly expressed genes (single-cell average; Supplementary Table 3). Each row
represents a discretized histogram for a single gene (sorted by CV from low to
high (top to bottom)). Colour represents the number of cells (yellow: 18 cells;
black: 0) that express the gene at the noted level. Grey dashed line denotes the
constant CV (0.25) highlighted in (a). c, Averaged expression density
distributions for the 281 low-variability genes (top) and the 241 highly variable
genes (bottom).

Figure 3 | Variation in isoform usage between single cells. a, RNA-Seq read
densities in single cells (blue) and population replicates (grey) for two
illustrative loci, each with two different isoforms (bottom). b, Distributions of
exon inclusion (PSI scores, x axis) for alternatively spliced exons of highly
expressed genes (single-cell TPM > 250) in individual cells (blue histogram,
top) and populations (grey histogram, bottom). c, Left, RNA-Seq read densities
for Irf7 (only cells in which the transcript is expressed are shown). Coloured
boxes mark exons analysed by RNA-FISH. Right, RNA-FISH images from


study variation in alternative isoform usage in two genes, and observed 
lower levels of isoform variability across single cells (the levels of 
heterogeneity differed in different cell types). Another study that used 
fluorescent reporters to quantify single-cell exon inclusion levels for 
one gene discovered highly variable and bimodal splicing patterns”'. 

To independently verify the existence of extensive differences in 
isoform ratios between cells, we designed RNA-FISH probes targeting 
constitutive and isoform-specific exons in two genes” (Irf7 and Acpp; 
Fig. 3c and Supplementary Figs 7 and 8). We found substantial 
expression variability in overall Irf7 levels between individual cells 
(as reflected by the ‘constitutive’ probes; Fig. 3c, top and bottom), 
mirroring our single-cell sequencing results (and further explored 
below). Furthermore, within each Irf7-expressing cell, we observed a 
bias towards either the inclusion or exclusion of the cassette exon 
(Fig. 3c and Supplementary Fig. 7, middle; for example, compare ‘high’ 
and ‘low’ marked cells). We obtained comparable results for Acpp 
using two probes designed to detect mutually exclusive alternative final 
exons (Supplementary Fig. 8). 

We next explored the sources and functional implications of 
expression variability. Bimodality among highly expressed immune

Figure 4 | Analysis of co-variation in single-cell mRNA expression levels
reveals distinct maturity states and an antiviral cell circuit. a, Principal
components analysis of 632 LPS-induced genes. Contributions of each cell
(points) to the first two principal components (PC1 and PC2). b, Clustered
correlation matrix of induced genes. Left, the Pearson correlation coefficients
(r) between single-cell expression profiles of every pair of 632 LPS-induced

simultaneous hybridization with probes for two constitutive (con.) regions of
the transcript (A: cyan (C); B: magenta (M)) and one alternatively spliced exon
(specific: orange (O)). White arrows (middle) highlight two cells with high
levels of Irf7, but opposite preferences for the alternatively spliced exon.
Histograms show global abundance ratios for isoform-specific and constitutive
probes (cells with fewer than five constitutive counts have been excluded; n = 490
cells; bottom histogram deviates from 0.5 owing to probe design; see
Supplementary Information). Scale bars, 250 μm (left); 25 μm (right).


response genes may reflect the presence of distinct cellular subtypes 
or stochastic differences in the activation of regulatory circuits'’. We 
performed a principal components analysis (Fig. 4a) on our single-cell 
expression profiles, focusing on the 632 genes that were induced at 
least twofold in the population-wide response to LPS'® (Supplemen- 
tary Table 5). We found two distinct subpopulations, clearly distin- 
guishable by the first principal component (PC1, 15% of the total 
variation; Fig. 4a). One group of fifteen cells expressed a core set of 
antiviral and inflammatory defence cytokines (including Tnf, Il1a,
Il1b and Cxcl10) at extremely high levels (TPM > 1,000), whereas 
the remaining three cells expressed them at far weaker levels 
(TPM < 50). Some cell surface proteins (Ccr7 and Cd83) and chemo- 
kines (Ccl22), which are known markers of BMDC maturation, 
showed the opposite expression pattern (Fig. 4b and Supplementary 
Fig. 9). 

During maturation, BMDCs switch from antigen-capturing to 
antigen-presenting cells that prime the adaptive immune system”. 
Maturation can occur either in response to pathogen-derived ligands 
(pathogen-dependent maturation), such as LPS, or when clusters of 
BMDCs are disrupted in culture” (pathogen-independent maturation).




genes (rows, columns). Right, the projection score (green: high; blue: low) for
each gene (row) onto PC1 (left) and PC2 (right). c, Confirmation of correlations
for Irf7–Stat2 (n = 655 cells) and Irf7–Ifit1 (n = 934 cells) by RNA-FISH.
d–f, Expression levels for 16 genes in single BMDCs (columns), measured using
single-cell qRT-PCR, in wild type (n = 36) (d), Irf7−/− (n = 47) (e) and
Ifnr−/− (n = 18) (f), at 4 h after LPS stimulation (Supplementary Information).

Both processes lead to induction of maturation markers, but only
pathogen-dependent maturation results in co-expression of defence 
cytokines. 

Examining the expression of maturation markers and defence cyto- 
kines (Supplementary Fig. 9) suggested that our 18 cells represent two 
distinct maturity states: (1) 15 cells that were in the early stages of 
pathogen-dependent maturation (Fig. 4a, ‘maturing’, triangles; grey 
triangles, the two cells furthest along in this process); and (2) three 
cells that probably matured during the culturing process (Fig. 4a, 
‘mature’, squares; pathogen-independent). We further verified the 
existence of these sub-populations via RNA-FISH (Supplemen- 
tary Fig. 10), single-cell quantitative reverse transcription PCR 
(qRT-PCR; Supplementary Fig. 11, Supplementary Information and 
Supplementary Table 6), and cell sorting based on surface markers 
identified from the RNA-Seq data (Supplementary Fig. 12 and 
Supplementary Information). These results highlight that single-cell 
RNA-Seq can sensitively distinguish between closely related, yet dis- 
tinct, developmental states, even within the same cell type. 

Because differences in cell state explain only a small portion of the 
observed heterogeneity, we next examined the variation that might 
arise from the differential activity of regulatory circuits. We reasoned 
that co-variation across single cells between the mRNA levels of a 
transcription factor and its targets would represent a potential regula- 
tory interaction, and, furthermore, would suggest that heterogeneity in 
the regulator’s expression may underlie the variability of its targets. 
Such a correlative approach has successfully identified regulatory con- 
nections from population-level transcription profiles measured in dif- 
ferent conditions'*”’. Here, we attempted to apply it to several single 
cells in the same condition. 

To this end, we calculated the correlation in expression profiles 
between every pair of induced genes across all single cells, and iden- 
tified a cluster of 137 genes that varied in a correlated way and were 
strongly discriminated by the second principal component (PC2, 8% 
of the variation; Fig. 4a, b). The genes of this cluster included the 
known antiviral master regulators Irf7 and Stat2, and were highly 
enriched for members of the antiviral response’” (60 out of 137 genes, 
P=2.5X10 °, hypergeometric test; Supplementary Table 5), as well 
as STAT2 targets'® (73 out of 137 genes, P = 4.5 X 10°, hypergeo- 
metric test). Most (100 out of 137) of the cluster’s genes were bimodally 
expressed across single cells (Fig. 2c, bottom) despite being strongly 
expressed at the population level (13 genes TPM > 250; 53 genes TPM 
> 50). We independently validated a subset of these correlations using 
single-cell qRT-PCR and RNA-FISH (Fig. 4c, d). Moreover, single-cell
qRT-PCR analysis of additional time points demonstrated that these 
correlations persisted at 6h as well (Supplementary Discussion and 
Supplementary Fig. 13). 
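
The enrichment P values quoted above come from hypergeometric tests. A minimal sketch of the upper-tail calculation is given below; the cluster size (137) and overlap (60) are taken from the text, whereas using the 632 induced genes as the universe and 120 as the number of annotated antiviral genes are hypothetical choices made only to have something to compute.

```python
from math import comb


def hypergeom_upper_tail(universe, annotated, drawn, overlap):
    """P(X >= overlap) when `drawn` genes are sampled without replacement
    from `universe` genes, of which `annotated` carry the annotation."""
    total = comb(universe, drawn)
    max_overlap = min(drawn, annotated)
    favourable = sum(
        comb(annotated, i) * comb(universe - annotated, drawn - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favourable / total


# Cluster size and overlap follow the text; universe and annotated are assumed.
p_value = hypergeom_upper_tail(universe=632, annotated=120, drawn=137, overlap=60)
print(f"P(overlap >= 60) = {p_value:.2e}")
```

Exact integer arithmetic with `math.comb` avoids the numerical underflow that can affect very small tail probabilities when they are computed in floating point term by term.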

We hypothesized that bimodal variation in the expression of the 
cluster’s genes may be related to differences in the levels and activities 
of Stat2 and Irf7. To test this hypothesis, we measured expression of a 
set of antiviral genes by single-cell qRT-PCR in LPS-stimulated 
BMDCs from Irf7 knockout (Irf7−/−) mice (Supplementary Information).
As expected, this perturbation ablated expression of most of
the variable antiviral transcripts in our signature, while leaving non- 
variable antiviral transcripts relatively unaffected (Fig. 4e). However, 
Stat2 expression and variability levels were unaffected by the Irf7 
knockout, indicating that Stat2 may act either upstream or in parallel 
to Irf7 during the response** (Supplementary Fig. 14). As both Stat2 
and Irf7 are targets of the interferon-signalling pathway, we stimulated 
and profiled BMDCs from interferon receptor knockout (Ifnr−/−)
mice. In these cells, we found markedly reduced expression for both 
Stat2 and Irf7, as well as all other measured cluster genes (Fig. 4f). 

Our analysis provides a proof-of-concept demonstrating how co- 
variation between transcripts across seemingly homogeneous single 
cells can help to identify and assemble regulatory circuits. Specifically, 
in our variable circuit (Supplementary Fig. 14) interferon signalling is 
required for the induction of Stat2 and Irf7, which, in turn, act to
induce our variable antiviral cluster genes. Our experiments do not
definitively determine, however, which component of the circuit 
causes the observed heterogeneity per se. One compelling possibility 
is that upstream noise is propagated from the interferon-signalling 
pathway first to Stat2 and Irf7 and then to the target genes**’®. This 
hypothesis is supported by the variation we observed in Stat1 and Stat2 
protein levels and nuclear localization (Supplementary Discussion and 
Supplementary Figs 15 and 16). However, because temporal snapshots 
of RNA and protein are not always directly comparable (Supplemen- 
tary Discussion and Supplementary Figs 15 and 16), new strategies for 
tracing the spatiotemporal dynamics of both proteins and RNA in 
single living cells are needed to fully test this hypothesis’’. 

A similar approach could potentially be used to explore the conse- 
quences of bimodality in splicing. Even looking at just 18 cells, we 
witnessed interesting examples of bimodal splicing patterns for genes 
whose isoforms have distinct functional consequences. For example, 
the splicing regulators Srsf3 and Srsf7 are each known to contain a 
‘poison cassette exon’ that, when included, targets the RNA for degradation
via nonsense-mediated decay’ (Supplementary Fig. 17).
Meanwhile, splicing differences in other regulatory genes may further 
enhance expression diversity: for example, proteins encoded by differ- 
ent isoforms of Irf7 (Fig. 3c) differentially activate interferon-responsive 
genes in vitro**. These examples suggest that heterogeneity in splicing 
may represent another layer of response encoding. 

In conclusion, our study reveals extensive bimodality in the tran- 
scriptional response of BMDCs to LPS, reflected in gene expression, 
alternative splicing and regulatory circuit activity. Although some 
variation in expression reflects differences in developmental state, 
other bimodal patterns reflect the differential activity of an antiviral 
regulatory circuit in this temporal snapshot. These phenomena 
allowed us to treat each cell as a ‘perturbation system’ for reconstruct- 
ing cell circuits”*, even with relatively few cells. 

Moreover, our results demonstrate how co-variation across single 
cells can help dissect and refine gene modules that may be indistin- 
guishable in population-scale measurements. For instance, in a recent 
population-scale study'®, we identified a large cluster of 808 ‘late- 
induced’ LPS genes that was enriched for both maturation genes and 
Stat-regulated antiviral genes. These two subsets could not be sepa- 
rated by population-level expression profiles alone’’, but our single- 
cell data from a single time point clearly distinguishes them. Similarly, 
the unexpected and prevalent skewing we discovered in alternative 
splicing between single cells revises our molecular view of this process. 
Furthermore, although many of our analyses focused on highly 
expressed genes to reduce the potential influence of amplification 
noise, our data also revealed substantial bimodality among more 
moderately expressed transcripts, such as large non-coding RNAs 
(lincRNAs; Supplementary Fig. 18). This suggests that the low popu- 
lation-level expression of these transcripts” may sometimes reflect 
high expression in a small subset of cells as opposed to uniform levels 
of low expression. Although further technical improvements will be 
necessary to disentangle these two hypotheses (Supplementary Fig. 5), 
single-cell measurements should help to facilitate the discovery and 
annotation of lincRNAs. 

Comparing our results to other single-cell RNA-Seq data sets (for 
example, Fig. 2a and Supplementary Fig. 4) indicates that the source of 
the analysed tissue (in vitro versus ex vivo), the biological condition of 
the individual cells (steady state versus dynamically responding), and 
the cellular microenvironment all probably influence the extent of 
single-cell heterogeneity within a system. When applied to complex 
tissues—such as unsorted bone marrow, developing embryos, tumours 
and other rare clinical samples—the variability seen through single- 
cell genomics may help to determine new cell classification schemes, 
identify transitional states, discover previously unrecognized bio- 
logical distinctions, and map markers that differentiate them. Ful- 
filling this potential would require new strategies to address the high 
levels of noise inherent in single-cell genomics, both technical, owing
to minute amounts of input material, and biological, for example,
owing to short bursts of RNA transcription*’. Future studies that cou- 
ple technological advances in experimental preparation with new com- 
putational approaches would enable analyses, based on hundreds or 
thousands of single cells, to reconstruct intracellular circuits, enumer- 
ate and redefine cell states and types, and transform our understanding 
of cellular decision-making on a genomic scale. 


METHODS SUMMARY 


BMDCs, prepared as previously described’’, were stimulated with LPS for 4h and 
then sorted as single cells or populations (10,000 cells) directly into TCL lysis 
buffer (Qiagen) supplemented with 1% (v/v) 2-mercaptoethanol. After performing
a 2.2× clean-up with Agencourt RNAClean XP Beads (Beckman Coulter),
whole transcriptome-amplified cDNA products were generated using the 
SMARTer Ultra-low RNA kit (Clontech), and conventional Illumina libraries 
were made and sequenced to an average depth of 27 million read pairs (HiSeq 
2000, Illumina). Expression levels and splicing ratios were quantified using 
RSEM“ and MISO’, respectively. Additional experiments were performed using 
RNA-FISH (Panomics), immunofluorescence, FACS and single-cell qRT-PCR 
(Single Cell-to-CT (Invitrogen) and BioMark (Fluidigm)). Full Methods and any
associated references are provided in the Supplementary Information.


MBNL proteins repress ES-cell-specific alternative
splicing and reprogramming

Hong Han, Manuel Irimia, P. Joel Ross, Hoon-Ki Sung, Babak Alipanahi, Laurent David, Azadeh Golipour,
Mathieu Gabut, Iacovos P. Michael, Emil N. Nachman, Eric Wang, Dan Trcka, Tadeo Thompson, Dave O’Hanlon,
Valentina Slobodeniuc, Nuno L. Barbosa-Morais, Christopher B. Burge, Jason Moffat, Brendan J. Frey, Andras Nagy,
James Ellis, Jeffrey L. Wrana & Benjamin J. Blencowe


Previous investigations of the core gene regulatory circuitry that 
controls the pluripotency of embryonic stem (ES) cells have largely 
focused on the roles of transcription, chromatin and non-coding 
RNA regulators'*. Alternative splicing represents a widely acting 
mode of gene regulation**, yet its role in regulating ES-cell pluri- 
potency and differentiation is poorly understood. Here we identify 
the muscleblind-like RNA binding proteins, MBNL1 and MBNL2, 
as conserved and direct negative regulators of a large program of 
cassette exon alternative splicing events that are differentially regu- 
lated between ES cells and other cell types. Knockdown of MBNL 
proteins in differentiated cells causes switching to an ES-cell-like 
alternative splicing pattern for approximately half of these events, 
whereas overexpression of MBNL proteins in ES cells promotes 
differentiated-cell-like alternative splicing patterns. Among the 
MBNL-regulated events is an ES-cell-specific alternative splicing 
switch in the forkhead family transcription factor FOXP1 that con- 
trols pluripotency’. Consistent with a central and negative regula- 
tory role for MBNL proteins in pluripotency, their knockdown 
significantly enhances the expression of key pluripotency genes 
and the formation of induced pluripotent stem cells during somatic 
cell reprogramming. 

A core set of transcription factors that includes OCT4 (also called 
POU5F1), NANOG and SOX2, together with specific microRNAs and
long non-coding RNAs, control the expression of genes required for 
the establishment and maintenance of ES-cell pluripotency’*'°’. 
Alternative splicing, the process by which splice sites in primary tran- 
scripts are differentially selected to produce structurally and functionally 
distinct messenger RNA and protein isoforms, provides a powerful 
additional mechanism with which to control cell fate”*”’, yet its role 
in the regulation of pluripotency has only recently begun to emerge. In 
particular, the inclusion of a highly conserved ES-cell-specific ‘switch’ 
exon in the FOXP1 transcription factor changes its DNA binding spe- 
cificity such that it stimulates the expression of pluripotency transcrip- 
tion factors, including OCT4 and NANOG, while repressing genes 
required for differentiation’. However, the trans-acting regulators of this 
and other alternative splicing events'*’° implicated in ES-cell biology 
are not known. These factors are important to identify, as they may 
control regulatory cascades that direct cell fate, and likewise they may 
also control the efficiency and kinetics of somatic cell reprogramming. 

To identify such factors, we used high-throughput RNA sequencing 
(RNA-seq) data to define human and mouse cassette alternative exons 
that are differentially spliced between ES cells and induced pluripotent 
stem cells (iPSCs), and diverse differentiated cells and tissues, referred 
to below as ‘ES-cell-differential alternative splicing’. A splicing code 


analysis’” was then performed to identify cis-elements that may promote 
or repress these exons. The RNA-seq data used to profile alternative 
splicing were also used to detect human and mouse splicing factor 
genes that are differentially expressed between ES cells/iPSCs and 
non-ES cells/tissues. By integrating these data sources, we sought to 
identify differentially expressed splicing regulators with defined bind- 
ing sites that match cis-elements predicted by the code analysis to 
function in ES-cell-differential alternative splicing. 

We identified 181 human and 103 mouse ES-cell-differential alter- 
native splicing events, with comparable proportions of exons that 
are ≥25% more included or more skipped in ES cells versus the other
profiled cells and tissues (Fig. 1a, Supplementary Figs 1a and 2, and
Supplementary Tables 1 and 2). When comparing orthologous exons 
in both species, 25 of the human and mouse ES-cell-differential alter- 
native splicing events overlapped (P< 2.2 X 10 '°; hypergeometric 
test). The human and mouse ES-cell-differential alternative splicing 
events are significantly enriched in genes associated with the cytoske- 
leton (for example, DST, ADD3), plasma membrane (for example, 
DNM2, ITGA6) and kinase activity (for example, CASK, MARK2 
and MAP2K7) (Supplementary Table 3). They also include the afore- 
mentioned FOXP1 ES-cell-switch alternative splicing event, and prev- 
iously unknown alternative splicing events in other transcription or 
chromatin regulatory factor genes (for example, TEAD1 and MTA1) 
that have been implicated in controlling pluripotency'*””. These results 
suggest a considerably more extensive role for regulated alternative 
splicing in ES-cell biology than previously appreciated. 
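
The selection of ES-cell-differential events can be illustrated with a simple threshold on the difference in per cent spliced in (PSI). The 25% cut-off comes from the text; summarizing each group by its mean PSI, the function name and the toy PSI values are assumptions made for this sketch and may differ from the criteria given in the Methods.

```python
def classify_event(es_psi, other_psi, min_delta=25.0):
    """Compare mean PSI (0-100) in ES cells/iPSCs with mean PSI elsewhere."""
    delta = sum(es_psi) / len(es_psi) - sum(other_psi) / len(other_psi)
    if delta >= min_delta:
        return "more included in ES cells"
    if delta <= -min_delta:
        return "more skipped in ES cells"
    return "not differential"


# Made-up PSI values (per cent) for three hypothetical cassette exons.
print(classify_event([80.0, 90.0], [40.0, 50.0]))
print(classify_event([10.0, 20.0], [60.0, 70.0]))
print(classify_event([50.0, 55.0], [45.0, 50.0]))
```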

The splicing code analysis revealed that motifs corresponding to 
consensus binding sites of the conserved MBNL proteins are the most 
strongly associated with ES-cell-differential alternative splicing in 
human and mouse. The presence of MBNL motifs in downstream 
flanking intronic sequences is associated with exon skipping in ES 
cells, whereas their presence in upstream flanking intronic sequences 
is associated with exon inclusion in ES cells (Fig. 1b, human code; 
Supplementary Fig. 1b, mouse code). To a lesser extent, features 
resembling binding sites for other splicing regulators, including poly- 
pyrimidine tract binding protein (PTBP) and RNA-binding fox (RBFOX) 
proteins, may also be associated with ES-cell-differential alternative 
splicing. 

From RNA-seq expression profiling of 221 known or putative 
splicing factors, 11 genes showed significant differential expression 
between human ES cells/iPSCs and other cells and tissues (Bonferroni- 
corrected P < 0.05, Wilcoxon rank-sum test) (Fig. 1c and Supplemen- 
tary Table 4). Notably, MBNL1 and MBNL2 had the lowest relative 
mRNA levels in ES cells/iPSCs compared to other cells and tissues

Figure 1 | Identification of regulators of ES-cell-differential alternative
splicing. a, Heat map of per cent spliced in (PSI) values for 95 representative
ES-cell-differential alternative splicing events in transcripts that are widely
expressed across human ES cells/iPSCs, non-ES-cell lines and differentiated
tissues. b, Splicing code features that are significantly associated with
ES-cell-differential alternative splicing. Features are ranked according to Pearson
correlation P values (y axis) for alternative exons with either lower (top) or


(Fig. 1c, Supplementary Fig. 3a and Methods). Quantitative RT-PCR 
(polymerase chain reaction with reverse transcription; qRT-PCR) 
assays confirmed this observation (Supplementary Fig. 3b). Similar 
results were obtained when analysing mouse expression data (Supplemen- 
tary Fig. 3c-e and Supplementary Table 4). PTBP, RBFOX and other 
splicing factors potentially associated with ES-cell-differential alterna- 
tive splicing by the splicing code analysis did not exhibit significant 
differences in mRNA levels between ES cells/iPSCs and other cells or 
tissues. Collectively, these results suggest a conserved and prominent 
role for MBNL1 and MBNL2 in ES-cell-differential alternative splicing. 

Because MBNL proteins are expressed at minimal levels in ES cells 
compared to other cell types, we proposed that they may repress ES- 
cell-differential exons in non-ES cells, and/or activate the inclusion of 
exons in non-ES cells that are skipped in ES cells. Indeed, previous 
studies have shown that in differentiated cells, MBNL proteins suppress 
exon inclusion when they bind upstream flanking intronic sequences, 
and they promote inclusion when binding to downstream flanking 
intronic sequences””’. The results of the splicing code analysis are 
consistent with this mode of regulation, when taking into account that 
MBNL proteins are depleted in ES cells relative to differentiated cells 
and tissues (Fig. 1b and Supplementary Fig. 1b). 

To test the above hypothesis, we used short interfering RNAs 
(siRNAs) to knock down MBNL1 and MBNL2 (to ~10% of their
endogenous levels), individually or together, in human (293T and 
HeLa) and mouse (neuro2A (N2A)) cells (Fig. 2a and Supplementary 
Fig. 4a; see below). For comparison, knockdowns were performed in 
human (H9) and mouse (CGR8) ES cells. RT-PCR assays were used to 
monitor the ES-cell-switch exon of FOXP1/Foxp1 (human exon 18b/ 
mouse exon 16b), which is partially included in ES cells and fully 
skipped in differentiated cell types. The splicing code analysis suggested 
that this exon is associated with conserved regulation by MBNL proteins, 
through possible direct disruption of splice-site recognition (Fig. 2b; 
see legend and below). Knockdown of MBNL2 in 293T or HeLa cells 
resulted in a <1% increase in FOXP1 exon 18b inclusion, whereas 

compared to non-ES cells/tissues are shown. cRPKM, corrected reads per 

Figure 2 | MBNL proteins regulate ES-cell-specific alternative splicing. 
a, Western blots confirming efficient knockdown of MBNL1 and MBNL2 
proteins in human 293T cells transfected with siRNA pools targeting these 
factors (siMBNL1+2, lane 6). Lane 5, lysate from cells transfected with a 
non-targeting siRNA pool (siControl). Lanes 1-4, serial dilutions (1:1, 1:2, 1:4 and 
1:8) of lysate from cells transfected with siControl. b, Splicing code map 
highlighting genomic locations of MBNL, RBFOX and PTBP motifs associated 
with ES-cell-specific alternative splicing of FOXP1/Foxp1 exon 18b/16b, the 
inclusion of which forms the FOXP1-ES/Foxp1-ES isoform. Human (black), 
mouse (grey) or conserved features (red) are indicated. Note that conserved 
MBNL motifs are associated with possible direct interference of exon 18b/16b 
splice site regulation. c, RT-PCR assays monitoring mRNA levels of FOXP1 
canonical (blue exon) and FOXP1-ES (red exon) isoforms in 293T cells 
transfected with siControl, siMBNL1, siMBNL2 or siMBNL1+2. RT-PCR used 
splice-junction-specific primers, as indicated. Expression levels of actin are 
shown as loading controls. d, mRNA levels of murine Foxp1-canonical and 
Foxp1-ES isoforms were assayed as in c in N2A cells. Expression levels of Gapdh 
are shown as loading controls. 

knockdown of MBNL1 alone, or together with MBNL2, resulted in 
increases in PSI (per cent spliced in), from zero to 2-2.4% and 6.5- 
7.1%, respectively (Figs 2c and 3c and Supplementary Figs 4b and 5). 
More pronounced effects were observed for Foxp1 exon 16b in N2A 
cells (PSI shift from 0 to 15.1 for the double knockdown; Fig. 2d and 
Supplementary Fig. 4c). Knockdowns in ES cells had modest effects on 
exon 18b/16b splicing, consistent with the low levels of MBNL express- 
ion in these cells (Supplementary Fig. 4d, e). Knockdown of a third 
MBNL family member, MBNL3, which has a more restricted cell-type 
distribution compared to MBNL1 and MBNL2 (ref. 22), had no detectable 
effect on exon 18b splicing (Supplementary Fig. 5). These results 
suggest that MBNL1 and MBNL2 proteins have conserved and par- 
tially redundant roles in the negative regulation of FOXP1/Foxp1 exon 
18b/16b inclusion. 
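
As a worked illustration of the PSI (per cent spliced in) values quoted above, the sketch below computes PSI as the signal from the exon-included isoform divided by the summed signal of both isoforms. The function name and the input numbers are hypothetical, chosen only so that the result sits on the same scale as the reported double-knockdown shift; they are not measurements from this study, and the quantification actually used may differ in detail.

```python
def psi(inclusion: float, exclusion: float) -> float:
    """Per cent spliced in: share of total signal from the exon-included isoform."""
    total = inclusion + exclusion
    if total <= 0:
        raise ValueError("no signal for either isoform")
    return 100.0 * inclusion / total


# Hypothetical isoform signals (arbitrary units), not data from the study.
control = psi(inclusion=0.0, exclusion=200.0)
double_kd = psi(inclusion=14.2, exclusion=185.8)

# PSI shift caused by the knockdown, in PSI points.
shift = double_kd - control
print(f"control PSI = {control:.1f}, knockdown PSI = {double_kd:.1f}, shift = {shift:.1f}")
```

The same function applies whether the inputs are band intensities from an RT-PCR gel or junction read counts, provided both isoforms are measured on a common scale.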


[Figure 3 graphic not reproduced. Recoverable labels: a, exons affected by siMBNL1+2 versus ES-cell-differential exons; b, ΔPSI (ES-cell − diff. cells) against ΔPSI (siMBNL1+2 − siControl); c, RT-PCR panels with PSI values (including FOXP1, MAP3K4, SLK and actin); e, ES-cell-excluded and ES-cell-included exons plotted against distance to nearest splice site (nt).] 


Figure 3 | MBNL proteins regulate approximately half of 
ES-cell-differential alternative splicing events. a, Venn diagram showing the 
proportion of ES-cell-differential alternative splicing events (green) that 
display ≥15 PSI change between HeLa cells transfected with siRNA pools 
targeting MBNL1 and MBNL2 (siMBNL1+2) versus siControl pool (orange). 
b, High association (P < 2.2 × 10⁻¹⁶, one-sided binomial test between 
quadrants) between differences in PSIs of ES cells versus differentiated cells/ 
tissues, and differences in PSIs of siMBNL1+2 knockdown versus siControl 
treatments. c, Representative RT-PCR validations for ES-cell-differential 
alternative splicing events that have PSI changes in HeLa cells after 
siMBNL1+2 transfection and for ES-cell-differential alternative splicing events 
that do not change upon siMBNL1+2 knockdown (c-); splicing patterns in 
human H9 ES cells are shown for comparison. d, Percentage of alternative 
splicing events with overlapping MBNL1 CLIP-seq binding clusters in C2C12 
cells for ES-cell-differential or non-ES-cell-regulated alternative exons. 
e, Merged map of MBNL binding clusters in transcripts with 
ES-cell-differential alternative splicing events. Maps of MBNL1 binding sites with 
respect to exons that have higher or lower inclusion in ES cells/iPSCs, relative to 
non-ES cells/tissues. 

To assess the extent to which ES-cell-differential alternative splicing 
events are controlled by MBNL proteins, MBNL1 and MBNL2 were 
knocked down in HeLa cells, and RNA-seq profiling was used to detect 
alternative splicing changes (Fig. 3). Of 119 profiled ES-cell-differentially 
spliced exons, nearly half are affected by knockdown of MBNL proteins, 
with a ≥15 PSI change towards an ES-cell-like alternative splicing 
pattern (Fig. 3a). A strong overall association (P < 2.2 × 10⁻¹⁶, 
one-sided binomial test) was observed when comparing PSI changes for 
exons differentially spliced between ES cells and non-ES cells/tissues, 
and PSI changes for the same exons following knockdown of MBNL 
proteins (Fig. 3b). RT-PCR experiments confirmed all analysed MBNL 
knockdown-dependent and -independent PSI changes (Fig. 3c and 
Supplementary Fig. 6a). The specificity of the knockdown experiments 
was further demonstrated by comparing individual siRNAs that target 
different sequences within MBNL1 transcripts (Supplementary Fig. 6b). 
Comparable results were observed when MBNL1 and MBNL2 proteins 
were simultaneously knocked down in 293T cells, and in undifferenti- 
ated C2C12 mouse myoblast cells (Supplementary Fig. 7). Conversely, 
overexpression of MBNL1 and MBNL2 proteins in mouse ES cells 
promoted differentiated-cell-like patterns for all analysed ES-cell- 
differential alternative splicing events (Supplementary Fig. 8), includ- 
ing a switch to the exclusive use of the canonical (that is, non-ES cell) 
exon 16 in Foxp1 transcripts. Consistent with this observation, over- 
expression of MBNL proteins in ES cells also led to increased kinetics of 
silencing of core pluripotency factors upon differentiation, and further 
promoted the expression of specific lineage markers representative of 
all three germ layers (Supplementary Fig. 9). 
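
The comparison of PSI changes described in this paragraph can be sketched as follows. Each exon contributes a pair of values: its PSI difference between ES cells and differentiated cells, and its PSI difference after MBNL knockdown. Exons whose two differences share a sign fall in concordant quadrants, and a one-sided binomial tail gives the probability of seeing at least that many concordant exons under a 50:50 null. This is only one plausible reading of a "binomial test between quadrants"; the function name, the null model and the example values are all assumptions, and the numbers are invented for illustration.

```python
from math import comb


def quadrant_binomial(pairs, min_shift=15.0):
    """pairs: (dPSI ES-vs-differentiated, dPSI knockdown-vs-control) per exon."""
    concordant = sum(1 for es, kd in pairs if es * kd > 0)
    discordant = sum(1 for es, kd in pairs if es * kd < 0)
    n = concordant + discordant
    # One-sided tail: P(X >= concordant) when each exon is concordant with p = 0.5.
    p_value = sum(comb(n, k) for k in range(concordant, n + 1)) / 2 ** n
    # Exons that shift by at least min_shift PSI points in the ES-cell-like direction.
    es_like = [(es, kd) for es, kd in pairs if es * kd > 0 and abs(kd) >= min_shift]
    return concordant, discordant, p_value, es_like


# Hypothetical dPSI pairs, for illustration only.
pairs = [(40, 22), (-35, -18), (25, 16), (-50, -30),
         (30, 5), (20, -3), (-28, -15), (45, 38)]
concordant, discordant, p_value, es_like = quadrant_binomial(pairs)
print(concordant, discordant, p_value, len(es_like))
```

The `min_shift` default mirrors the ≥15 PSI threshold stated in the text; exons that move in the ES-cell-like direction by less than that are counted as concordant but not as affected.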

Mapping of MBNL protein binding to endogenous transcripts using 
ultraviolet crosslinking coupled to immunoprecipitation and sequencing 
(CLIP-seq or HITS-CLIP) in undifferentiated C2C12 myoblast 
cells confirmed that these proteins directly target ES-cell-differential 
alternative splicing events, including Foxp1 exon 16b (Fig. 3d and Sup- 
plementary Fig. 10a). Of 57 mouse ES-cell-differential exons expressed 
in C2C12 cells, ~34 (60%) are associated with overlapping or proximal 
clusters of MBNL CLIP-seq tags (‘binding clusters’), whereas binding 
clusters are associated with 72 out of 601 (12%) of exons that are not 
differentially regulated in ES cells (P < 2.2 × 10⁻¹⁶, proportion test; 
Fig. 3d). The binding clusters associated with ES-cell-differential 
alternative splicing are significantly enriched in consensus binding 
sites for MBNL proteins (Supplementary Fig. 10b). Moreover, 
consistent with the splicing code analysis (Fig. 1b and Supplementary 
Fig. 1b) and previous results, the locations of MBNL binding 
clusters correlate with whether the target exons are more or less 
included in ES cells compared to other cells and tissues (Fig. 3e). 
Collectively, the results so far demonstrate that MBNL proteins act 
widely and directly to regulate ES-cell-differential alternative splicing, 
and consequently pluripotency factor expression. 
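
The enrichment of binding clusters quoted above (34 of 57 ES-cell-differential exons versus 72 of 601 other exons) can be checked with a simple two-proportion test. The sketch below uses a pooled z-test with a normal approximation, which is an assumption: the text says only "proportion test", so the exact variant used (for example, with a continuity correction) may differ. The function name is hypothetical; the counts are those given in the text.

```python
from math import erfc, sqrt


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test; returns both proportions, z and a two-sided P."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal beyond |z|.
    p_two_sided = erfc(abs(z) / sqrt(2))
    return p_a, p_b, z, p_two_sided


# Counts as reported in the text.
p_es, p_other, z, p_value = two_proportion_z(34, 57, 72, 601)
print(f"{p_es:.0%} vs {p_other:.0%}, z = {z:.2f}, P = {p_value:.1e}")
```

Under these assumptions the P value falls well below the 2.2 × 10⁻¹⁶ bound quoted in the text.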

We next asked whether MBNL proteins have an impact on somatic 
cell reprogramming (Fig. 4a). Secondary mouse embryonic fibroblasts 
(MEFs) expressing the ‘OKSM’ transcription factors (OCT4, KLF4, 
SOX2, c-MYC) from transgenes under doxycycline-inducible control 
were transfected with siRNA pools to knock down MBNL1 and 
MBNL2 (siMbnl1+2), or with a control, non-targeting siRNA pool 
(siControl). At days 3 and 5 after doxycycline induction, mRNA 
expression of endogenous pluripotency genes, including Oct4, 
Nanog, Sall4 and Alpl, were assayed by qRT-PCR (Fig. 4b and 
Supplementary Fig. 11a). None of these genes displayed significant 
changes in expression at day 3 (Fig. 4a, b); however, at day 5, MBNL 
knockdown stimulated their expression by approximately twofold over 
the siControl treatment (Fig. 4b and Supplementary Fig. 11a). MBNL 
knockdown also resulted in a ~30% increase in the colony area immu- 
nostained for SSEA1, a pluripotency-associated marker expressed early 
during reprogramming (Fig. 4c). In contrast, knockdown of OCT4 
(siOct4) resulted in significant reductions in endogenous pluripotency 
gene expression and in SSEA1-positive colonies (Fig. 4b, c). 

Figure 4 | Knockdown of MBNL proteins enhances reprogramming 
efficiency and kinetics. a, Experimental scheme. b, qRT-PCR quantification of 
mRNA expression levels of endogenous Oct4 and Nanog (data for additional 
genes in Supplementary Fig. 11a). Secondary MEFs were transfected with 
control siRNAs (siControl), siRNAs targeting Mbnl1 and Mbnl2 (siMbnl1+2) or 
Oct4 (siOct4) and treated with doxycycline (Dox) for 3 days (blue bars) or 5 days 
(red bars) before analysis. Empty bars, secondary MEFs without doxycycline 
induction. Values represent means ± range (n = 3). c, Top: quantification of 
SSEA1-stained area change relative to siControl at day 5 after doxycycline 
induction; values represent means ± range (n = 3). Bottom: representative 
images of SSEA1 staining. Scale bar, 100 µm. d, Top: quantification of 
doxycycline-independent iPSC colony formation. Secondary MEFs were treated 
with doxycycline for 8 days followed by 5 days of doxycycline withdrawal and 
counting of alkaline-phosphatase-positive colonies. Bottom: representative 
images of alkaline phosphatase staining. e, Teratoma assay assessing the 
pluripotency potential of iPSCs derived from secondary MEFs after knockdown 
of MBNL proteins. Haematoxylin and eosin staining, with additional staining/ 
immunolabelling using periodic acid-Schiff (PAS; for detection of glycogen or 
glycoprotein producing cells), safranin O (SafO; for detection of cartilage), or 


Successful reprogramming requires that cells undergo a transition 
to an OKSM transgene-independent state. We therefore asked whether 
suppression of MBNL proteins promotes transgene independence. 
OKSM transgenes were induced with doxycycline for 8 days, then the 
cells were cultured for 5 days without doxycycline (Fig. 4a). Whereas 
knockdown of OCT4 reduced colony formation, knockdown of MBNL 
proteins resulted in an approximate twofold increase in transgene- 
independent colonies, as detected by alkaline phosphatase staining 
(P = 0.0004; one-sided t-test) (Fig. 4d and Supplementary Fig. 11b). 
iPSC lines derived from transgene-independent colonies after MBNL 
knockdown were pluripotent and contributed to all three germ layers 
in both teratoma and chimaera assays (Fig. 4e and Supplementary 
Figs 11-13). Consistent with these results, Mbnl expression is significantly 
reduced in secondary MEF clones cultured in the presence of doxycycline 
that are competent to achieve transgene independence (when 
doxycycline is removed) versus those that are not (P = 0.006; one-sided 
t-test) (Fig. 4f, left). Moreover, the PSI levels of ES-cell-differential 
alternative splicing events, including Foxp1 exon 16b, significantly 
antibody to neuronal nuclear antigen (NeuN) is shown (see Supplementary Figs 
12 and 13 for additional teratoma analysis and chimaera testing of the 
pluripotency potential of siMbnl iPSCs). Scale bar, 100 µm. f, Top: experimental 
scheme for clonal analysis. Upon doxycycline removal at day 21, clones derived 
from single cells either survive and form iPSCs (transgene independent) or do 
not survive (transgene dependent). Bottom: analysis of total Mbnl1/Mbnl2 
mRNA expression (left) and percentage of total PSI change (right) for Foxp1 
exon 16b in transgene-independent (red) and transgene-dependent (blue) 
clones, at day 21, where total PSI change is the PSI difference between MEFs and 
iPSCs during reprogramming. g, Quantification (by morphological 
examination) of human iPSC colonies formed by reprogramming BJ fibroblasts 
expressing shRNA targeting GFP (shGFP) or MBNL1 (shMBNL1). 

h, Immunostaining of human iPSCs derived from shMBNL-expressing BJ 
fibroblasts for TRA1-60, NANOG, SSEA4 and OCT4 pluripotency markers. 
Scale bar, 50 µm. See Supplementary Fig. 15 for additional characterization of 
human iPSCs. P values of one-sided t-tests shown for all comparisons in this 
figure. i, Model for the role of MBNL proteins in the regulation of ES-cell- 
differential alternative splicing, pluripotency and iPSC reprogramming. 
Asterisks indicate significantly enriched gene-function categories. 


correlate with ES cell/iPSC alternative splicing patterns only in those 
clones that are competent to transition to transgene independence 
(Fig. 4f, right; Supplementary Fig. 14 and Supplementary Table 5; 
r= 0.80, P= 3.2 X 10"). Notably, knockdown of MBNL1 in human 
fibroblasts expressing OKSM also resulted in an approximate twofold 
increase in the appearance of iPSC colonies (Fig. 4g, h and Supplemen- 
tary Fig. 15). MBNL proteins thus have a conserved, negative regula- 
tory role in somatic cell reprogramming. 

The results of this study reveal that MBNL proteins negatively regulate 
an ES-cell-differential alternative splicing network that controls pluri- 
potency and reprogramming (Fig. 4i). These proteins probably act, in 
part, by directly repressing the ES-cell-specific splicing switch in 
FOXP1, which promotes the expression of core pluripotency genes. 
However, additional genes with MBNL-regulated alternative splicing 
events have been linked to the control of pluripotency, indicating a 
more extensive role for the alternative splicing network in ES-cell 
biology (Fig. 4i). These observations represent the first evidence that 
trans-acting splicing regulators have a central role in the core circuitry 
required for ES-cell pluripotency and reprogramming. Our results 
further offer a potential new approach for enhancing the production 
of iPSCs for research and therapeutic applications. 


METHODS SUMMARY 

siRNA knockdown and RNA analysis. Cells were transfected with SMART-pool 
siRNAs (Dharmacon) using DharmaFECT1 reagent and collected 48 or 72 h after 
transfection. Secondary MEFs were transfected with siRNA pools using Lipofectamine 
RNAiMAX (Invitrogen), as described previously, and OKSM transgenes were 
induced with doxycycline 24 h later. Semi-quantitative RT-PCR assays were 
performed using the OneStep RT-PCR kit (Qiagen), with modifications as described 
previously. Quantitative RT-PCR assays were performed as previously 
described. Primer sequences are available upon request. 

iPSC colony formation assays and characterization. Cells were transfected with 
siRNA pools and treated with doxycycline for 5 days before imaging using an IN 
Cell Analyzer 2000 (GE Healthcare). To assay formation of doxycycline-independent 
colonies, secondary MEFs transfected with siRNA pools were treated with doxycy- 
cline for 8 days. Cell counting was performed before and at each passage after siRNA 
transfection. At day 8, the same number of cells were passaged in doxycycline-free 
media in 12-well plates and cultured until day 13, when they were fixed and stained 
with alkaline phosphatase for colony counting. For single-cell assays, secondary 
MEFs were induced by doxycycline treatment and clonal derivatives were cultured 
for 21 days. Removal of doxycycline at day 21 revealed alkaline-phosphatase- 
positive colonies (transgene-independent clones) and failed colonies (transgene- 
dependent clones). RNA-seq analysis was performed on three transgene-independent 
and five transgene-dependent clones at day 21 after doxycycline induction, as 
previously described. Details of human iPSC generation and characterization 
are available in the Methods. 

Teratoma and chimaera analysis. ES cells were injected subcutaneously into 
dorsal flanks of nude mice (CByJ.Cg-Foxn1nu/J) and resulting teratomas were 
analysed using immunohistochemistry or cell-specific staining 4 to 5 weeks after 
injection. Chimaera aggregation and whole-mount staining were performed as 
previously described. 


Full Methods and any associated references are available in the online version of 
the paper. 


Received 12 September 2012; accepted 7 May 2013. 
Published online 5 June 2013. 


1. Young, R. A. Control of the embryonic stem cell state. Cell 144, 940-954 (2011). 

2. Rinn, J. L. & Chang, H. Y. Genome regulation by long noncoding RNAs. Annu. Rev. 
Biochem. 81, 145-166 (2012). 

3. Bao, X. et al. MicroRNAs in somatic cell reprogramming. Curr. Opin. Cell Biol. 25, 
208-214 (2013). 

4. Pan, Q., Shai, O., Lee, L. J., Frey, B. J. & Blencowe, B. J. Deep surveying of alternative 
splicing complexity in the human transcriptome by high-throughput sequencing. 
Nature Genet. 40, 1413-1415 (2008). 

5. Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. 
Nature 456, 470-476 (2008). 

6. Braunschweig, U., Gueroussov, S., Plocik, A. M., Graveley, B. R. & Blencowe, B. J. 
Dynamic integration of splicing within gene regulatory pathways. Cell 152, 
1252-1269 (2013). 

7. Nilsen, T. W. & Graveley, B. R. Expansion of the eukaryotic proteome by alternative 
splicing. Nature 463, 457-463 (2010). 

8. Kalsotra, A. & Cooper, T. A. Functional consequences of developmentally regulated 
alternative splicing. Nature Rev. Genet. 12, 715-729 (2011). 

9. Gabut, M. et al. An alternative splicing switch regulates embryonic stem cell 
pluripotency and reprogramming. Cell 147, 132-146 (2011). 

10. Chen, X. et al. Integration of external signaling pathways with the core 
transcriptional network in embryonic stem cells. Cell 133, 1106-1117 (2008). 

11. Kim, J., Chu, J., Shen, X., Wang, J. & Orkin, S. H. An extended transcriptional network 
for pluripotency of embryonic stem cells. Cell 132, 1049-1061 (2008). 

12. Silva, J. et al. Nanog is the gateway to the pluripotent ground state. Cell 138, 
722-737 (2009). 

13. Irimia, M. & Blencowe, B. J. Alternative splicing: decoding an expansive regulatory 
layer. Curr. Opin. Cell Biol. 24, 323-332 (2012). 


14. Rao, S. et al. Differential roles of Sall4 isoforms in embryonic stem cell 
pluripotency. Mol. Cell. Biol. 30, 5364-5380 (2010). 

15. Salomonis, N. et al. Alternative splicing regulates mouse embryonic stem cell 
pluripotency and differentiation. Proc. Natl Acad. Sci. USA 107, 10514-10519 
(2010). 

16. Mayshar, Y. et al. Fibroblast growth factor 4 and its novel splice isoform have 
opposing effects on the maintenance of human embryonic stem cell self-renewal. 
Stem Cells 26, 767-774 (2008). 

17. Barash, Y. et al. Deciphering the splicing code. Nature 465, 53-59 (2010). 

18. Liang, J. et al. Nanog and Oct4 associate with unique transcriptional repression 
complexes in embryonic stem cells. Nature Cell Biol. 10, 731-739 (2008). 

19. Lian, I. et al. The role of YAP transcription coactivator in regulating stem cell 
self-renewal and differentiation. Genes Dev. 24, 1106-1118 (2010). 

20. Wang, E. T. et al. Transcriptome-wide regulation of pre-mRNA splicing and mRNA 
localization by muscleblind proteins. Cell 150, 710-724 (2012). 

21. Charizanis, K. et al. Muscleblind-like 2-mediated alternative splicing in the 
developing brain and dysregulation in myotonic dystrophy. Neuron 75, 437-450 
(2012). 

22. Pascual, M., Vicente, M., Monferrer, L. & Artero, R. The Muscleblind family of 
proteins: an emerging class of regulators of developmentally programmed 
alternative splicing. Differentiation 74, 65-80 (2006). 

23. Licatalosi, D. D. et al. HITS-CLIP yields genome-wide insights into brain alternative 
RNA processing. Nature 456, 464-469 (2008). 

24. Fernandez-Costa, J. M., Llamusi, M. B., Garcia-Lopez, A. & Artero, R. Alternative 
splicing regulation by Muscleblind proteins: from development to disease. Biol. 
Rev. Camb. Philos. Soc. 86, 947-958 (2011). 

25. Woltjen, K. et al. piggyBac transposition reprograms fibroblasts to induced 
pluripotent stem cells. Nature 458, 766-770 (2009). 

26. Takahashi, K. et al. Induction of pluripotent stem cells from adult human 
fibroblasts by defined factors. Cell 131, 861-872 (2007). 

27. Golipour, A. et al. A late transition in somatic cell reprogramming requires 
regulators distinct from the pluripotency network. Cell Stem Cell 11, 769-782 
(2012). 

28. Samavarchi-Tehrani, P. et al. Functional genomics reveals a BMP-driven 
mesenchymal-to-epithelial transition in the initiation of somatic cell 
reprogramming. Cell Stem Cell 7, 64-77 (2010). 

29. Calarco, J. A. et al. Global analysis of alternative splicing differences between 
humans and chimpanzees. Genes Dev. 21, 2963-2975 (2007). 

30. Labbé, R. M. et al. A comparative transcriptomic analysis reveals conserved 
features of stem cell pluripotency in planarians and mammals. Stem Cells 30, 
1734-1745 (2012). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements The authors thank U. Braunschweig, J. Ellis, S. Gueroussov and 
B. Raj for comments on the manuscript. We acknowledge D. Torti in the Donnelly 
Sequencing Centre for sequencing samples; L. Lee for assisting with the splicing code 
analysis; J. Garner (Hospital for Sick Children Embryonic Stem Cell Facility) for 
preparing feeder cells; A. Piekna for morphological examination of human iPSC 
colonies; M. Narimatsu for assisting with chimaerism analysis; and P. Mero for assisting 
with cell imaging. This work was supported by grants from the Canadian Institutes of 
Health Research (CIHR) (to B.J.B., J.L.W., A.N., J.E. and B.J.F.), the Ontario Research Fund 
(to J.L.W., B.J.B., A.N. and others), the Canadian Stem Cell Network (to A.N. and B.J.B.), 
and by a grant from the National Institutes of Health (R33MH087908) to J.E. H.H. was 
supported by a University of Toronto Open Fellowship. P.J.R., M.I. and N.L.B.-M. were 
supported by postdoctoral fellowships from the Ontario Stem Cell Initiative, Human 
Frontiers Science Program Organization, and the Marie Curie Actions, respectively. 


Author Contributions H.H. performed experiments in Figs 1-4 and Supplementary 
Figs 2-9 and 11-13. M.I. performed bioinformatic analyses in Figs 1-4 and 
Supplementary Figs 1, 3, 7, 10 and 14, with input from N.L.B.-M. L.D. and A.G. assisted 
with secondary MEF reprogramming experiments and clone characterization, and D.T. 
generated secondary MEF lines and performed chimaerism testing. P.J.R., T.T. and M.G. 
performed human reprogramming experiments and iPSC characterization. H.-K.S. 
performed teratoma assays. B.A. and B.J.F. generated splicing code data. I.P.M., H.-K.S. 
and D.O. assisted with ES-cell overexpression and differentiation experiments. E.W. and 
C.B.B. generated and analysed CLIP-seq data. E.N.N. and V.S. performed RT-PCR 
validation experiments. B.J.B., H.H. and M.I. designed the study, with input from J.L.W., 
J.E., A.N. and J.M. B.J.B., H.H. and M.I. wrote the manuscript, with input from the other 
authors. 


Author Information GEO accession numbers are provided in Supplementary Table 1. 
Reprints and permissions information is available at www.nature.com/reprints. The 
authors declare no competing financial interests. Readers are welcome to comment on 
the online version of the paper. Correspondence and requests for materials should be 
addressed to B.J.B. (b.blencowe@utoronto.ca). 


13 JUNE 2013 | VOL 498 | NATURE | 245 


©2013 Macmillan Publishers Limited. All rights reserved 


LETTER 


METHODS 

siRNA knockdown and RNA analysis. Cells were transfected with SMART-pool 
siRNAs (Dharmacon) using DharmaFECT1 reagent and collected 48 or 72 h 
after transfection. Secondary MEFs were transfected with siRNA pools using 
Lipofectamine RNAiMAX (Invitrogen), as described previously, and OKSM 
transgenes were induced with doxycycline 24 h later. Semi-quantitative 
RT-PCR assays were performed using the OneStep RT-PCR kit (Qiagen), with 
modifications as described previously. Quantitative RT-PCR (qRT-PCR) assays were 
performed as previously described. Primer sequences are available upon request. 

Cell lines and cell culture. HeLa, 293T and C2C12 cell lines were maintained in 
Dulbecco’s Modified Eagle Medium (DMEM) supplemented with 10% fetal 
bovine serum (FBS) and antibiotics (penicillin/streptomycin). Neuro2A (N2A) 
cells were grown in DMEM supplemented with 10% FBS, sodium pyruvate, MEM 
non-essential amino acids, and penicillin/streptomycin. H9 human ES cells, CGR8 
and R1 mouse ES cells were cultured as described previously. Secondary mouse 
embryonic fibroblasts (MEFs) were maintained in DMEM supplemented with 
10% FBS, L-glutamine and penicillin/streptomycin on 0.1% gelatin-coated plates. 
During reprogramming, secondary MEFs were grown in mouse ES media and 
induced to express OKMS factors using 1.5 µg ml⁻¹ of doxycycline as described 
previously. 

Protein extraction and western blotting. Cell pellets were lysed in 
radioimmunoprecipitation assay (RIPA) buffer by brief sonication. Protein lysate 
(30-150 µg) was separated on a 10% SDS-polyacrylamide gel and transferred to a PVDF 
membrane. The membranes were blotted with the following antibodies: anti-Flag 
M2 (1:1500, Sigma), anti-MBNL1 (1:500, Abcam), anti-MBNL2 (1:200, Santa Cruz 
Biotechnology) and anti-α-tubulin (1:5000, Sigma). Secondary antibodies (GE 
Healthcare) and chemiluminescence reagents (Perkin Elmer) were used as per 
the manufacturer’s instructions. 

RNA extraction and qRT-PCR assays. Total RNA was extracted using TRI 
Reagent (Sigma) or RNeasy columns (Qiagen). RT-PCR assays were performed 
using the OneStep RT-PCR kit (Qiagen), as per the manufacturer’s instructions. 
20 ng total RNA or 1 ng of polyA+ RNA was used per 10-µl reaction. Radiolabelled 
reactions contained 0.3 µCi of α-³²P-dCTP per 10-µl reaction. The number of 
amplification cycles was 22 for actin and Gapdh, and 27-32 for all other transcripts 
analysed. Reaction products were separated on 1-3% agarose gels. Quantification 
of isoform abundance was performed using either ImageQuant (GE Healthcare) or 
ImageJ software. To amplify the FOXP1/Foxp1 isoforms selectively (Fig. 2c, d), 
primers specific for splice junctions were used. 

For quantitative RT-PCR, first-strand cDNAs were generated from 1-3 µg of 
total RNA or 100 ng of polyA+ RNA using SuperScript III Reverse Transcriptase 
(Invitrogen), as per the manufacturer’s recommendations, and diluted to 20 ng µl⁻¹ 
and 1 ng µl⁻¹, respectively. qPCR reactions were performed in a 384-well format 
using 1 µl of each diluted cDNA and FastStart Universal SYBR Green Master 
(Roche Applied Science). All primer sequences are available upon request. 
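
The dilution arithmetic in the preceding paragraph can be made explicit. The sketch below assumes that the quoted concentrations refer to input RNA equivalents (that is, 1 ng of RNA is treated as yielding 1 ng-equivalent of cDNA), which the text does not state; the function name is hypothetical.

```python
def dilution_volume_ul(input_ng: float, target_ng_per_ul: float) -> float:
    """Final volume (in µl) at which input_ng reaches target_ng_per_ul.

    Assumes concentrations are expressed as input RNA equivalents.
    """
    if target_ng_per_ul <= 0:
        raise ValueError("target concentration must be positive")
    return input_ng / target_ng_per_ul


# Total RNA inputs of 1 µg and 3 µg, diluted to 20 ng per µl.
total_rna_low = dilution_volume_ul(1000, 20)
total_rna_high = dilution_volume_ul(3000, 20)

# 100 ng of polyA+ RNA, diluted to 1 ng per µl.
polya = dilution_volume_ul(100, 1)

# RNA equivalents loaded per qPCR well when 1 µl of the 20 ng per µl dilution is used.
per_well_ng = 1 * 20
print(total_rna_low, total_rna_high, polya, per_well_ng)
```

On this reading, each qPCR well receives the equivalent of 20 ng of total RNA or 1 ng of polyA+ RNA.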
Immunofluorescence. For immunofluorescence experiments, cells were fixed in 
4% PFA for 10 min at room temperature, washed with PBS, and permeabilized for 
10 min at 4 °C with 0.1% Triton X-100. After 1h of blocking, cells were incubated 
with primary antibodies overnight at 4 °C, and then with secondary antibodies for 
1 h at room temperature. Nuclei were stained with Hoechst 33258 (1:5,000, 
Sigma-Aldrich). Primary antibodies used in this study are: mouse IgM anti-SSEA1 (1:500, 
BD Biosciences), mouse anti-OCT4 (1:200, BD Biosciences), rabbit anti-NANOG 
(1:200, Cosmo Bio), goat anti-DPPA4 (1:250, R&D), mouse IgM anti-TRA1-60 
(1:100, Invitrogen), rabbit anti-NANOG (1:400, Cell Signaling), mouse anti- 
SSEA4 (1:100, Invitrogen), rabbit anti-OCT4 (1:200, Abcam), mouse anti-alpha 
fetoprotein (1:200, R&D), mouse anti-smooth muscle actin (1:200, Invitrogen), 
and mouse anti-beta-III-tubulin (1:200, Millipore). Secondary antibodies used in 
this study are: anti-mouse IgM Alexa555 (1:1000, Molecular Probes), anti-mouse 
IgG Alexa555 (1:1,000, Molecular Probes), anti-rabbit IgG Alexa594 (1:1,000, 
Molecular Probes), anti-rabbit IgG Alexa488 (1:500, Molecular Probes), and 
anti-goat IgG Alexa546 (1:1,000, Molecular Probes). 

iPSC colony-formation assays and imaging from secondary MEF reprogramming. 
Secondary MEFs were seeded in 12-well plates, transfected with siRNA pools, 
and treated with doxycycline for 5 days before fixing and staining. The plates 
were imaged (for both SSEA1-immunostained and DAPI channels) using an IN 
Cell Analyzer 2000 (GE Healthcare) with a ×4 objective. For each well, 20 
non-overlapping fields were captured and images were analysed using the Columbus 
System (PerkinElmer). A custom script was generated to identify SSEA1-positive 
and DAPI-positive colonies. The overall signal in each well was determined using 
the sum of the overlap area for the 20 fields captured. 
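
The per-well summary described above can be sketched as follows. The sketch takes "overlap area" to mean the area in each field that is both SSEA1-positive and DAPI-positive, which is an assumption, and it reports the treated well as a percentage change relative to the control well. Function names and the field values are hypothetical; this is not the custom script used in the study.

```python
FIELDS_PER_WELL = 20


def well_signal(field_overlap_areas):
    """Sum the per-field overlap areas for one well (20 fields expected)."""
    if len(field_overlap_areas) != FIELDS_PER_WELL:
        raise ValueError(f"expected {FIELDS_PER_WELL} fields per well")
    return sum(field_overlap_areas)


def percent_change(treated, control):
    """Change in stained area relative to the control well, in per cent."""
    return 100.0 * (treated - control) / control


# Hypothetical overlap areas (arbitrary units) for 20 fields per well.
control_well = well_signal([100] * FIELDS_PER_WELL)
knockdown_well = well_signal([130] * FIELDS_PER_WELL)
change = percent_change(knockdown_well, control_well)
print(control_well, knockdown_well, change)
```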

To assay the formation of doxycycline-independent colonies, secondary MEFs 
transfected with siRNA pools were treated with doxycycline for 8 days. Cell count- 
ing was performed before and at each passage after siRNA transfection and doub- 
ling rates were determined not to change significantly (data not shown). At day 8, 


the same number of cells were passaged into doxycycline-free mES-cell media on 
12-well plates and cultured until day 13, when they were fixed and stained with 
alkaline phosphatase for colony counting. 

Teratoma analysis. Cells were suspended in PBS and Matrigel (BD Bioscience) 
mixed solution, and 1 X 10° cells in 100 µl were injected subcutaneously into both 
dorsal flanks of nude mice (CByJ.Cg-Foxn1nu/J) anaesthetized with isoflurane. 
Four to five weeks after injection, mice were killed and teratomas were dissected, 
fixed overnight in 10% buffered formalin phosphate, and embedded in paraffin. 
Three-to-four-micrometre-thick sections were deparaffinized and hydrated in 
distilled water. Sections were stained either with haematoxylin and eosin for 
regular histological examination, or with the following dyes: 0.1% safranin O 
solution (cartilage, mesoderm-derived tissue) or 0.5% PAS solution (glycopro- 
tein-producing intestinal cell, endoderm-derived tissue). For immunohistoche- 
mistry, sections were deparaffinized and hydrated, and antigen retrieval process 
was performed. After blocking, sections were incubated overnight at 4°C with 
primary monoclonal antibody (1:100, Millipore MAB377, clone A60) specific for 
neuronal nuclear antigen (NeuN, ectoderm-derived tissue), followed by washing 
in PBS. After 1 h of incubation with secondary anti-mouse-HRP conjugated anti- 
body (1:500, Jackson ImmunoResearch, 115-035-003), signal was visualized by 
DAB (3,3’-diaminobenzidine; Vector Laboratories, SK-4100) substrate for 5-20 
min. Sections were counter-stained with haematoxylin. 

Chimaerism analysis. Chimaera aggregation and whole-mount staining were 
performed as described previously**. Chimaeras were obtained through aggregation
of siMbnl iPSC clumps with diploid Hsd:ICR(CD-1) embryos. E10.5 embryos
were dissected 24 h after in utero doxycycline treatment via ingestion.
After dissection, embryos were fixed with 0.25% glutaraldehyde, rinsed in
wash buffer (2 mM MgCl2, 0.01% sodium deoxycholate and 0.02% Nonidet-P40
in PBS) and then stained overnight in LacZ staining solution (20 mM MgCl2,
5 mM K3Fe(CN)6, 5 mM K4Fe(CN)6 and 1 mg ml−1 X-gal in PBS). Embryos
were embedded in paraffin, sectioned and counterstained with nuclear fast red. 

Generation and characterization of human iPSCs. Human BJ foreskin fibroblasts
(Stemgent) were reprogrammed using published protocols****, with the
following modifications. BJ fibroblasts were first infected with lentivirus vectors
encoding both a puromycin resistance gene and a doxycycline-inducible shRNA
targeting either GFP (negative control, target sequence: 5′-GCAAGCTGACCCTGAAGTTCAT-3′)
or MBNL1 (target sequence: 5′-GCCTGCTTTGATTCATTGAAA-3′).
Lentiviral vector preparations and infections were performed as
described**. After selection with 1 µg ml−1 puromycin, shRNA-encoding BJ fibroblasts
were infected with a second lentivirus vector (obtained from Addgene) co-expressing
both the mouse retrovirus receptor mSlc7a1 and the blasticidin resistance
gene’. During transient selection with 5 µg ml−1 blasticidin, puromycin was
reduced to 0.5 µg ml−1 and maintained at this concentration for 6 days after infection
with retroviral reprogramming vectors.

pMxs-based retrovirus vectors encoding the four reprogramming factors 
hOCT4, hSOX2, hKLF4, hCMYC (OSKM)”, were obtained from Addgene and 
packaged exactly as described*’. Puromycin/blasticidin-resistant BJ fibroblasts 
were infected in triplicate, using three separate preparations of retrovirus vectors. 
shRNA expression was induced by treatment with 2 µg ml−1 doxycycline, which
was initiated contemporaneously with retrovirus vector infection; control cells 
were treated with vehicle only. 

Six days after retrovirus infection, BJ fibroblasts were seeded on a monolayer of 
feeder cells. Embryonic day 12.5 fibroblasts from Tg(DR4)1Jae/J mice (Jackson 
Laboratory) were seeded on collagen-coated 6-well plates at a density of 3 × 10°
cells per well as described”; retrovirus-infected BJ fibroblasts were seeded at a
density of 2 × 10* cells per well. At day 28 of reprogramming, quantification (by
whole-well morphological examination and by TRA1-60 immunostaining) of 
human iPSC colonies was performed by investigators who were blinded as to 
the experimental conditions. To count TRA1-60-positive colonies, the plates were 
imaged (for both TRA1-60-immunostained and DAPI channels) using an IN Cell 
Analyzer 2000 (GE Healthcare) with a ×4 objective. For each well, 64 non-overlapping
fields were captured and images were analysed using the Columbus System 
(PerkinElmer). Knockdown of MBNL1 resulted in an approximate twofold 
increase in TRA1-60 immunostaining colonies over the control knockdown with 
GFP-targeting shRNA (data not shown). Additional OSKM retrovirus-infected BJ 
fibroblasts were seeded in parallel; individual colonies from doxycycline-treated 
plates were manually isolated 4 weeks after infection, and seeded on feeders in 
collagen-coated 24-well plates**. Cells from these colonies were expanded, and 
subsequently characterized (Supplementary Fig. 15) as described”. 

Clonal analysis by RNA-seq during reprogramming. In a single-cell assay, 
secondary MEFs were plated in individual wells of a 96-well plate, OKSM factors 
were induced by doxycycline treatment and the clonal derivatives were cultured for 
21 days. Removal of doxycycline at day 21 revealed that approximately 50% of clones 
produced abundant alkaline-phosphatase-positive colonies (transgene-independent 
clones), whereas the rest yielded few or no colonies (transgene-dependent
clones). RNA-seq analysis was performed for three transgene-independent and 
five transgene-dependent clones at day 21 after doxycycline induction (Sup- 
plementary Table 1). 

Using RNA-seq-derived PSI values (see Supplementary Methods), the inclusion
levels of mouse ES-cell-differential cassette alternative exons were quantified for
each of these clones. Fifty-one ES-cell-differential alternative splicing events with
sufficient read coverage in all samples and with a ≥25 PSI difference between
iPSCs and MEFs were compared between the two types of clones (Supplementary
Fig. 14 and Supplementary Table 5).
RNA-seq data and analysis. We used RNA-seq data from 36 and 32 different 
human and mouse samples, respectively. Details and sample sources are provided 
in Supplementary Table 1. The samples comprise, for human: 5 ES-cell lines (3 
different cell lines and 2 replicates), 2 iPSC lines, 7 non-ES-cell lines, and 22 adult 
tissues (18 different tissues and 4 replicates); for mouse: 6 ES-cell lines, 2 iPSC 
lines, 8 non-ES-cell lines (5 different cell lines and 3 replicates) and 16 adult tissues 
(10 different types and 6 replicates). Details of the RNA-seq analysis are available in
Supplementary Information. 

To identify alternative exons differentially regulated in ES cells, we first calculated
a single averaged PSI value for tissues of similar origin (see Supplementary Table 1).
Only events with enough coverage in at least two ES-cell samples and three distinct
tissue types were considered. ‘ES-cell-differential alternative splicing events’ were
defined as those with a mean PSI difference of ≥25 between ES cells and differentiated
tissues. To account for alternative splicing events potentially related to cell
proliferation, we also required a mean PSI difference of ≥25 between ES-cell lines
and non-ES-cell lines, when the event had sufficient coverage in at least one cell line
sample. The set of background alternative splicing events used throughout the
study comprises alternatively spliced exons (defined here as exons with PSI values of
<95% and >5% in at least one sample) that meet the same expression requirement
(that is, in ≥2 ES cells and ≥3 differentiated tissue types) and that show an average
difference in PSI level of <5% between ES cells and differentiated tissue samples,
and between the ES-cell and non-ES-cell lines.
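
The thresholds in this paragraph can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function name, the input layout (lists of PSI values on a 0-100 scale) and the requirement that the tissue and cell-line differences share the same sign are assumptions made for illustration only.

```python
from statistics import mean

MIN_DIFF = 25.0    # minimum mean PSI difference for an ES-cell-differential event
MAX_BG_DIFF = 5.0  # maximum mean PSI difference for a background event


def classify_event(es_psi, tissue_psi, line_psi=()):
    """Label one exon from PSI values (0-100) in three sample groups.

    es_psi: PSI in ES-cell samples; tissue_psi: one averaged PSI per tissue
    type; line_psi: PSI in non-ES-cell lines (empty if coverage is lacking).
    """
    # Coverage requirement: at least 2 ES-cell samples and 3 tissue types.
    if len(es_psi) < 2 or len(tissue_psi) < 3:
        return "low_coverage"
    d_tissue = mean(es_psi) - mean(tissue_psi)
    d_line = mean(es_psi) - mean(line_psi) if line_psi else None
    if abs(d_tissue) >= MIN_DIFF:
        # The cell-line filter applies only when cell-line data exist.
        # Requiring the same sign is an assumption of this sketch.
        if d_line is None or (abs(d_line) >= MIN_DIFF and d_line * d_tissue > 0):
            return "es_differential"
        return "other"
    # Background: alternatively spliced somewhere, but no ES-cell difference.
    alt_spliced = any(5 < p < 95 for p in [*es_psi, *tissue_psi, *line_psi])
    small_line_diff = d_line is None or abs(d_line) < MAX_BG_DIFF
    if alt_spliced and abs(d_tissue) < MAX_BG_DIFF and small_line_diff:
        return "background"
    return "other"
```

For example, `classify_event([80, 90], [20, 30, 40], [50])` passes both filters, whereas the same exon with a cell-line PSI of 80 fails the proliferation filter and is left unclassified.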
Analyses of splicing factor expression. A total of 221 human and 214 mouse 
genes were selected for analysis based on literature mining for previously described 
splicing functions, ‘splicing’- and/or ‘spliceosome’-associated Gene Ontology (GO) 
terms, and/or the presence of a PFAM-annotated RNA-binding domain (Sup- 
plementary Table 4). To calculate the mRNA expression values for each sample, 
we used corrected (for mappability) reads per kilobase pair and million mapped 
reads values (cRPKMs) of the ‘stable’ (as defined by BioMart) Ensembl transcript
for each gene, as previously described”. 

Splicing factor genes were ranked according to the relative extent of their 
differential expression (as determined by cRPKM values) in ES cells and iPSCs 
versus non-ES-cell lines and tissues by comparing summed ranks of each gene in 
all ES cell/iPSCs across the full range of samples. On the basis of this approach, 
human MBNL1 and MBNL2 showed the first and second lowest overall rank in ES
cells/iPSCs, respectively, and mouse Mbnl1 and Mbnl2 showed the first and third
lowest overall rank in ES cells/iPSCs, respectively. 
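
One plausible reading of the summed-rank approach is sketched below in Python. The exact procedure is not spelled out here, so the details are assumptions: within each gene, samples are ranked by cRPKM (1 = lowest), the ranks held by ES-cell/iPSC samples are summed, and genes are ordered by that sum. Ties are broken arbitrarily, and all names are hypothetical.

```python
def summed_es_rank(expr, es_samples):
    """expr: {sample: cRPKM} for one gene.

    Rank samples by expression (1 = lowest) and sum the ranks held by the
    ES-cell/iPSC samples. A low sum means low relative expression in ES cells.
    """
    ordered = sorted(expr, key=expr.get)
    rank = {sample: i + 1 for i, sample in enumerate(ordered)}
    return sum(rank[s] for s in es_samples)


def rank_genes(table, es_samples):
    """table: {gene: {sample: cRPKM}}. Return genes, lowest summed rank first."""
    scores = {gene: summed_es_rank(expr, es_samples) for gene, expr in table.items()}
    return sorted(scores, key=scores.get)
```

With two ES-cell samples and two tissues, a gene expressed least in both ES-cell samples receives the minimum possible sum (1 + 2 = 3) and is listed first.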

To assess the statistical significance of the differential expression of individual
splicing factor genes, we compared their cRPKM values in ES cells/iPSCs to the
cRPKMs in all other cell lines and differentiated tissues using a Wilcoxon rank-sum
test after quantile normalization. Splicing factors with Bonferroni-corrected
P-values <0.05 were considered significantly differentially expressed (Supplementary
Table 4).
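
The test and the multiple-testing correction can be sketched in pure Python as follows. This is an illustrative approximation, not the code used for the analysis: it uses the normal approximation to the rank-sum statistic without a tie correction to the variance, it omits the quantile-normalization step, and the function names are hypothetical.

```python
import math


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum P value (normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Tied values share the average of the ranks they span (midranks).
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p * n_tests)
```

A gene would then be called significant when `bonferroni(rank_sum_p(es_values, other_values), n_genes)` falls below 0.05, with `n_genes` set to the number of splicing factor genes tested.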
Splicing code analyses. The feature vectors for each species were produced by 
extracting sequence-based features from alternatively spliced exons, their adjacent 
constitutive exons, and 300 nucleotides of flanking intronic sequence. The features 
used were a subset of those defined previously”, with the following differences: (1) 
all sequence length features are now in the log domain; (2) owing to a lack of 
comprehensive transcript libraries and the corresponding uncertainty about 
downstream consequences of frame shifts, premature termination codon features 
were excluded; and (3) conservation scores and conservation-weighted motifs 
were excluded from the feature set. In addition, related features (that is, consensus 
recognition sequences for a given splicing factor inferred by different methods) 
were combined and included as independent features. 

To identify features strongly associated with ES-cell-differential exon inclusion 
or exclusion, we compared 172 ES-cell-differential exons and 908 background 
exons for human, along with 102 ES-cell-differential exons and 811 background 
exons for mouse. Associations between features and ES-cell-differential splicing 
were detected using Pearson correlation. For each feature, we computed the cor- 
relation between its value and the difference in average PSI values in ES cells and 
non-ES cells, across exons. To obtain more accurate correlation values, we con- 
sidered two scenarios: (1) a positive scenario in which the differences in average 
PSI values in ES cells and non-ES cells are larger than 25%; and (2) a negative
scenario in which the differences in average PSI values in ES cells and non-ES
cells are smaller than −25%.
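
The per-feature correlation under the two scenarios can be sketched as below. One detail is an assumption rather than something stated in the text: each scenario pools the exons that pass its threshold with the background exons before the Pearson correlation is computed. The record layout and function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def scenario_correlations(records, threshold=25.0):
    """records: (feature_value, delta_psi, is_background) per exon.

    Returns (r_positive, r_negative): the feature/delta-PSI correlation for
    exons with delta PSI > threshold plus background exons, and for exons
    with delta PSI < -threshold plus background exons (pooling is assumed).
    """
    pos = [(f, d) for f, d, bg in records if bg or d > threshold]
    neg = [(f, d) for f, d, bg in records if bg or d < -threshold]
    return pearson(*zip(*pos)), pearson(*zip(*neg))
```

Splitting the scenarios keeps a feature that marks ES-cell-included exons from being cancelled out by one that marks ES-cell-excluded exons.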

CLIP-seq analysis. We used recently described Mbnl1 CLIP-seq data from C2C12 
cells*’. To estimate the fractions of ES-cell-differential and background alternative 
splicing events that are associated with MBNL1 binding, we asked whether CLIP 
binding clusters overlap the alternative exon and/or flanking intron sequences of 
each event. CLIP binding clusters were defined as previously described*". In short, 
CLIP-seq tags were trimmed of adapters and then collapsed to remove redundant 
sequences. These tags were mapped to the genome and a database of splice junctions
using Bowtie. To identify CLIP clusters lying within genic regions, gene bound- 
aries were first defined using RefSeq, Ensembl and UCSC tables. For each window 
of 30 nucleotides covered by at least one CLIP-seq tag, a test was performed to 
assess whether the tag density in the window exceeded that which is predicted by a 
simple Poisson model which accounts for gene expression and pre-mRNA length. 
An alternative splicing event was considered to have an overlapping MBNL1 
binding cluster if the mid-point of the cluster is located within the alternative 
exon, within 300 nucleotides of the 5’ or 3’ ends of upstream or downstream 
flanking introns, and/or within 30 nucleotides of the 3’ end of the C1 exon or
the 5’ end of the C2 exon. Only alternative splicing events that had significant read
coverage (see above) in at least one of two C2C12 samples used were analysed. In 
total, 57 ES-cell-differential alternative splicing events and 601 background 
alternative splicing events were compared. 
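
The overlap rule for a single event can be sketched as follows. The sketch assumes plus-strand, 1-based inclusive coordinates and clips the intronic windows to the intron length; the `Event` fields and function names are hypothetical and not taken from the original analysis.

```python
from dataclasses import dataclass


@dataclass
class Event:
    c1_end: int      # last nucleotide of the upstream constitutive exon (C1)
    exon_start: int  # first nucleotide of the alternative exon
    exon_end: int    # last nucleotide of the alternative exon
    c2_start: int    # first nucleotide of the downstream constitutive exon (C2)


def regions(ev, intron_flank=300, exon_flank=30):
    """Intervals in which a cluster mid-point counts as overlapping."""
    up_lo, up_hi = ev.c1_end + 1, ev.exon_start - 1   # upstream intron
    dn_lo, dn_hi = ev.exon_end + 1, ev.c2_start - 1   # downstream intron
    return [
        (ev.c1_end - exon_flank + 1, ev.c1_end),          # 3' end of C1
        (up_lo, min(up_hi, up_lo + intron_flank - 1)),    # 5' end, upstream intron
        (max(up_lo, up_hi - intron_flank + 1), up_hi),    # 3' end, upstream intron
        (ev.exon_start, ev.exon_end),                     # alternative exon
        (dn_lo, min(dn_hi, dn_lo + intron_flank - 1)),    # 5' end, downstream intron
        (max(dn_lo, dn_hi - intron_flank + 1), dn_hi),    # 3' end, downstream intron
        (ev.c2_start, ev.c2_start + exon_flank - 1),      # 5' end of C2
    ]


def has_cluster(ev, clusters):
    """clusters: (start, end) pairs. True if any mid-point lies in a region."""
    windows = regions(ev)
    for start, end in clusters:
        mid = (start + end) // 2
        if any(lo <= mid <= hi for lo, hi in windows):
            return True
    return False
```

Using the mid-point rather than any overlap means that a long cluster straddling a window boundary is counted only if its centre lies inside the window.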

To generate an RNA regulatory map” highlighting MBNL1 binding sites in 
relation to ES-cell-differential alternative splicing events with either higher (ES- 
cell-included) or lower (ES-cell-excluded) exon inclusion levels in ES cells versus 
other cell lines and tissues, we applied the following procedure: for each nucleotide 
position from the regions described above and from three sets of alternative 
splicing events (ES-cell-included, ES-cell-excluded and background), we counted 
the average number of MBNL1 CLIP-seq tags. To minimize the impact of outliers
with extreme read density, we limited the maximum count per event to an average 
of ten reads per position within each region. To normalize the length of the 
alternative splicing exon, we divided each exon into 100 bins and uniquely 
assigned each position to the integer of 100*position/exon_length, with a relative 
weight inversely related to the length of the alternative splicing exon. To draw the 
map, we used sliding windows of 30 nucleotides for the intronic regions and 
25 nucleotides for the length-corrected exons (total of four windows shown). 
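
The exon length normalization can be sketched as below. The bin assignment follows the integer part of 100*position/exon_length given above. The specific weight (number of bins divided by exon length) is an assumption, since the text says only that the weight is inversely related to exon length. The cap on read density and the sliding-window smoothing are omitted.

```python
def bin_exon_counts(counts, n_bins=100):
    """counts: per-nucleotide CLIP tag counts along one exon (5' to 3').

    Each 0-based position is assigned to bin int(n_bins * position / length),
    and its count is scaled by n_bins / length (assumed weight), so exons of
    different lengths contribute comparably to the length-corrected map.
    """
    length = len(counts)
    bins = [0.0] * n_bins
    for pos, count in enumerate(counts):
        b = n_bins * pos // length
        bins[b] += count * (n_bins / length)
    return bins
```

With this weighting, an exon covered uniformly at one tag per nucleotide sums to `n_bins` regardless of its length. Exons shorter than `n_bins` leave some bins empty.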
Evolutionary conservation of ES-cell-differential events. We analysed three 
different aspects of conservation of the human and mouse ES-cell-differential 
alternative exons*’. To determine whether the alternative exon is conserved at 
the genomic level, we performed a lift-over of the exon coordinates using 
Galaxy (https://main.g2.bx.psu.edu/). Exons with a unique lift-over hit in the other 
species, and with AG (splicing acceptor) and/or GT (splicing donor) sites were 
considered to be genome-conserved in the other species. In addition, if the ortho- 
logous exon has a PSI of <95% and >5% in at least one sample from each species, 
alternative splicing of the exon was defined as conserved. Finally, to assess whether 
ES-cell-differential regulation is conserved, we applied two criteria: (1) the exons 
are independently detected as ES-cell-differential in human and mouse using the 
criteria as described above (total = 25 alternative splicing events); and (2) the 
orthologous exons must meet minimal read coverage requirements (also as 
described above) to afford direct comparison. 

Analyses of function and protein domain enrichment. To investigate whether 
ES-cell-differential events are significantly enriched in genes with specific func- 
tional associations and/or protein domains, we used the online tool DAVID 
(http://david.abcc.ncifcrf.gov/)“*** (with annotations from levels 3, 4 and 5 in
the GO hierarchy), KEGG pathways and InterPro domains. As background, we 
used the genes with at least one alternative splicing event that met the minimal 
expression criteria described above (that is, detection in ≥2 ES cells and ≥3 differentiated
cell/tissue types). The main clusters of functionally related genes
enriched in both human and mouse (as well as among the conserved) ES-cell- 
differential events (Supplementary Table 3) are associated with: actin cytoskeleton, 
plasma membrane (including cell junctions) and protein kinase-associated terms. 

The bromodomain protein Brd4 insulates chromatin
from DNA damage signalling

DNA damage activates a signalling network that blocks cell-cycle
progression, recruits DNA repair factors and/or triggers sen- 
escence or programmed cell death’. Alterations in chromatin 
structure are implicated in the initiation and propagation of the 
DNA damage response’. Here we further investigate the role of 
chromatin structure in the DNA damage response by monitoring 
ionizing-radiation-induced signalling and response events with a 
high-content multiplex RNA-mediated interference screen of 
chromatin-modifying and -interacting genes. We discover that 
an isoform of Brd4, a bromodomain and extra-terminal (BET) 
family member, functions as an endogenous inhibitor of DNA 
damage response signalling by recruiting the condensin II chro- 
matin remodelling complex to acetylated histones through bromo- 
domain interactions. Loss of this isoform results in relaxed 
chromatin structure, rapid cell-cycle checkpoint recovery and 
enhanced survival after irradiation, whereas functional gain of this 
isoform compacted chromatin, attenuated DNA damage response 
signalling and enhanced radiation-induced lethality. These data 
implicate Brd4, previously known for its role in transcriptional 
control, as an insulator of chromatin that can modulate the signal- 
ling response to DNA damage. 

Detection and repair of damaged DNA is integral for cell survival 
and accurate transmission of genetic information to progeny. Defects 
in the DNA damage response (DDR) contribute to oncogenesis and 
genomic instability in tumours** and render tumour cells sensitive to 
DNA-damaging cancer therapy*. Early signalling events that trigger 
and transduce the DDR occur in the context of chromatin, and it 
is likely that modulation of chromatin structure plays a role in 
DDR signalling’. Histone proteins are known targets of DDR post- 
translational modification”, but a detailed understanding of the role 
of chromatin modulation in the DDR is lacking. 

To explore the role of chromatin modulation in the DDR, we 
developed a high-throughput, high-content quantitative microscopy 
assay multiplexed for early and late DDR endpoints, and applied this to 
an RNA-mediated interference (RNAi) library focused on proteins 
that interact with and modify chromatin (see Methods)’*. For each 
time point, cells were co-stained with γH2AX antibodies to measure
early signalling events in the DDR, Hoechst 33342 to monitor cell- 
cycle progression and phospho-histone H3 (pHH3) to measure 
mitotic entry. At the latest time point, cleaved caspase-3 (CC3) was 
substituted for pHH3 to measure apoptotic cell death. The screening 
assay was validated with small molecule inhibitors of DDR signalling 
as well as RNAi directed against known components of the DDR 
pathway (Supplementary Figs 1-4). 

The most pronounced increase in γH2AX foci number, size and
intensity after ionizing radiation was observed at 1 and 6 h after
knockdown of Brd4; this remained elevated at 24 h (Fig. 1a, b and
Supplementary Fig. 4). Eight hairpins directed against Brd4 showed
this effect, making off-target effects unlikely (Fig. 1a and Supplementary
Fig. 4). Neither Brd4 knockdown in the absence of irradiation
(Fig. 1b) nor knockdown of other bromodomain-containing proteins
(Fig. 1b and Supplementary Fig. 4) significantly altered γH2AX.
Increased ionizing-radiation-induced γH2AX after Brd4 loss was
further confirmed using short interfering RNA (siRNA) oligonucleotides
targeting additional independent Brd4 sequences (Fig. 1f and
Supplementary Fig. 5).

Brd4 encodes three splice isoforms (A, B and C in Fig. 1c). Each 
isoform contains two amino (N)-terminal bromodomains (BD1 and 
BD2) that bind acetylated lysine, and an extra-terminal (ET) domain 
recently reported to interact with several chromatin-binding proteins’. 
The A isoform contains a carboxy (C)-terminal domain (CTD) that 
functions as a transcriptional co-activator with the pTEFb complex'®”’. 
This region is notably absent in the B and C isoforms, and in the B 
isoform it is replaced with a divergent short 75 amino-acid segment. All 
three Brd4 isoforms are expressed in U2OS cells, and the short hairpin 
RNAs (shRNAs) used in our screen targeted all three isoforms 
(Supplementary Table 1). We confirmed that a single distinct siRNA 
that was active against all Brd4 isoforms replicated the Brd4 loss-of-function
phenotype of elevated ionizing-radiation-induced γH2AX
(Supplementary Fig. 5). 

To establish the relative effects of the isoforms on the DDR, we 
performed gain-of-function experiments. Overexpression of Brd4 isoform
B most potently suppressed ionizing-radiation-induced γH2AX
foci (Fig. 1d). We designed isoform-specific siRNAs to reduce expression
of isoform A or B messenger RNA (mRNA) (Fig. 1e) and protein
(Supplementary Fig. 5) selectively; selective targeting of isoform C was 
not technically possible owing to complete coding sequence overlap 
with isoforms A and B. We observed that selective depletion of Brd4 
isoform B, but not isoform A, increased H2AX phosphorylation over a 
wide range of ionizing radiation doses (Fig. 1). 

To investigate whether elevated γH2AX levels observed in Brd4-deficient
cells resulted from increased production of ionizing-radiation-induced
DNA double-strand breaks or from faulty double-strand break
repair, we used pulsed-field gel electrophoresis to quantify double-strand 
breaks in control and Brd4 knockdown cells. As shown in Fig. 2a, Brd4 
knockdown had minimal effects on the generation and repair kinetics of 
double-strand breaks. These observations, together with our finding that 
individual γH2AX foci were larger and more intense in irradiated Brd4
knockdown cells (Fig. 1b, Supplementary Fig. 4 and Supplementary 
Tables 1 and 2), indicate that there is enhanced signalling from damaged 
DNA in the absence of Brd4, rather than an increase in the amount of 
damage or repair deficiency. 

Changes in overall chromatin structure can affect H2AX phosphorylation,
probably by controlling the accessibility of signalling molecules
to DNA damage sites'*"*. Interestingly, γH2AX foci form more
readily in ‘open’ areas of euchromatin”™, histone acetylation has been
linked to the ‘open’ chromatin state and histone deacetylase inhibitors
are known to increase H2AX phosphorylation’’. We speculated that
a bromodomain protein could influence H2AX phosphorylation
through interaction with acetylated histones and effects on global
chromatin structure, and therefore performed micrococcal nuclease
susceptibility experiments. Knockdown of Brd4 isoform B increased 
digestion by micrococcal nuclease, indicating a more ‘open’ overall 
chromatin structure, whereas knockdown of isoform A had minimal 
effects (Fig. 2b). Furthermore, we observed that cells transfected 
with Brd4 isoform B showed a distinct nuclear 4’,6-diamidino-2- 
phenylindole (DAPI) staining pattern, indicating a change in chro- 
matin structure (Fig. 2c). As shown in Fig. 2d, e, quantification of the 


Figure 2 | Brd4 isoform B limits H2AX 
phosphorylation through bromodomain-acetyl 
lysine-mediated effects on chromatin structure. 
a, Pulsed-field electrophoresis analysis of DNA 
from stable cell lines expressing indicated shRNA 
after 10 Gy IR (n = 3). b, Left: micrococcal nuclease 
assay of control or Brd4 knockdown cells. Right: 
line traces of representative gel lanes. c, Chromatin 
structure from cells expressing Flag-tagged Brd4 
isoform B (arrowheads) or A and C (arrows) shown 
by DAPI staining. d, Three-dimensional 
representation of nuclear DAPI staining intensity 
from cells in c as indicated by coloured frames.
e, DAPI pixel correlation from Brd4 isoform A, B,
C and untransfected control cells (n = 3).
f, Immunoblots (top) and quantification (bottom)
of H2AX phosphorylation after 250 nM DMSO, or 
active (+) and inactive (—) JQ1 at 1h after 10 Gy 
ionizing radiation (n = 3). g, γH2AX signal 1 h
after 10 Gy IR in cells expressing green fluorescent
protein (GFP)-wild-type Brd4 isoform B
(arrowheads), isoform B with mutations that
abrogate acetyl lysine binding of bromodomain 1
(BD1) or 2 (BD2) (arrows), or wild-type Brd4
isoform B in the presence of 250 nM (−) JQ1
(inactive) or (+) JQ1 as indicated.

Figure 3 | Brd4 isoform B interaction with the condensin complex affects
H2AX phosphorylation. a, Mass spectrometry identification of co- 
immunoprecipitated proteins from Flag-tagged Brd4 isoform-B-expressing 
cells. b, Identification of candidate Brd4 interactors by ranking chromatin 
modifier shRNAs from screen for elevated H2AX foci intensity, area and 
number at 1 and 6h after 10 Gy IR. Dashed red lines indicate top quartile. 

c, Intersection of two independent mass-spectrometry experiments (a) with the 
top quartile of candidates in b. Overlapping set includes Brd4, SMC2 and 
NCAPD3. d, Network representation of SMC proteins and relation to DNA 
damage signalling with protein-protein and kinase—substrate interactions 
collated from the literature. Protein-protein and kinase-substrate interactions 
shown by solid and dotted lines, respectively. Colours indicate condensin 
complex (blue), cohesin complex (pink), other SMC protein complexes (green), 
cell-cycle regulators (orange) and DNA damage signalling machinery (mint). 
Diamonds show mass spectrometry and high-content screening hits from a and 


nuclear staining texture showed a more heterogeneous DAPI intensity 
pattern, and significantly lower pixel-to-pixel correlation of DAPI 
staining in cells overexpressing isoform B, indicative of isoform-B- 
mediated alterations in global chromatin structure. Expression of 
isoform A had no effect on DAPI staining, whereas overexpression 
of isoform C had smaller effects than those observed with isoform B. 

Our finding that Brd4 isoform B expression affects global chromatin 
structure and attenuates H2AX phosphorylation in response to DNA 
damage led us to investigate the subcellular localization of isoform B in 
response to ionizing radiation. Immunofluorescence experiments 
showed that ionizing radiation did not grossly alter Brd4 isoform B 
nuclear localization, which tightly mirrored DNA patterns shown by 
DAPI staining (Supplementary Fig. 6a). Interestingly, subcellular frac- 
tionation of U2OS cells and extraction of chromatin-bound proteins 
demonstrated that irradiation caused enhanced isoform B association 
with the high salt-extractable chromatin fraction (Supplementary 
Fig. 6b, c), indicating increased association of isoform B with chro- 
matin after DNA damage. 

Bromodomains recognize epigenetic marks on chromatin by bind- 
ing to acetyl-lysine’®. We therefore tested the contribution of Brd4 
bromodomain interactions to alterations in γH2AX phosphorylation
using JQ1, a small molecule inhibitor of BET bromodomains'’’. Only 
the active enantiomer of JQ1 caused increased H2AX phosphorylation 
b. Border colours denote overlap of screens from c. The new interaction of Brd4
with the condensin complex is indicated by the red line. e, Validation of 
isoform-B-condensin interaction with blotting immunoprecipitates from cells 
transfected with indicated Flag-tagged constructs. f, Immunoblot verification 
of SMC2 knockdown from cells transfected with SMC2 siRNA. g, Nuclear 
γH2AX signal from cells transfected with indicated combinations of control
DNA, Brd4 isoform B and/or SMC2 siRNA. Data were quantified from ten 
fields of two independent experiments normalized to control cells. h, H2AX 
phosphorylation 1 h after 10 Gy IR in cells simultaneously expressing isoform B 
and control (arrows) or SMC2 siRNA (arrowheads). i, Chromatin staining 
pattern in cells simultaneously expressing isoform B and control (red frame) or 
SMC2 (blue frame) siRNA. j, Mean nuclear γH2AX signal in GFP-isoform-B-expressing
cells with or without SMC2 knockdown. Data are from ten fields of
two independent experiments, as in h, normalized to control untransfected cells. 


after irradiation in U2OS cells (Fig. 2f), similar to the effects observed 
after Brd4 isoform-B-specific knockdown. Furthermore, JQ1 treat- 
ment or Brd4 isoform B knockdown did not significantly alter total 
histone levels or levels of histone acetylation (Supplementary Figs 7 
and 8). Interestingly, overexpression of Brd4 isoform B led to altera- 
tion in the nuclear staining pattern of acetyl-lysine, closely mirroring 
the DAPI staining pattern induced by expression of isoform B 
(Supplementary Fig. 7b). 

The concentration of JQ1 that we used (250 nM) is consistent with 
the reported in vitro half-maximum inhibitory concentration for Brd4 
bromodomains 1 (BD1, 77nM) and 2 (BD2, 33nM)’”. To evaluate 
directly the role of each bromodomain in isoform B, we performed 
gain-of-function experiments using wild-type Brd4 in the absence or 
presence of JQ1, or constructs harbouring mutations that abrogate 
acetyl lysine binding by BD1 or BD2. Mutations in BD1, or addition 
of the active enantiomer of JQ1, potently reversed the γH2AX-suppressive
effects of isoform B expression (Fig. 2g). Notably, mutations
that abrogate BD1 binding to acetyl-lysine also rescued the
ionizing-radiation-induced cell death phenotype observed with Brd4 
isoform B gain-of-function (see below), implicating BD1 in the mech- 
anism of DNA damage inhibition (cf. Fig. 4b). 

Figure 4 | Brd4 isoform B affects

To probe further the role of lysine acetylation on γH2AX-Brd4
effects, we examined the combined effects of histone deacetylase
inhibitors and Brd4 knockdown. We found that when Brd4 isoform B
knockdown was combined with exposure to 50 nM LBH589, an inhibitor
of histone deacetylases 1-3 and 6 (ref. 18), H2AX phosphorylation
was enhanced to a greater extent than with either treatment alone 
(Supplementary Fig. 9). This effect could be observed even in unirra- 
diated cells, although the total amount of H2AX phosphorylation 
remained lower than that seen in irradiated cells. Taken together, these 
findings indicate that Brd4 isoform B binding to acetylated regions of 
chromatin alters chromatin structure and limits H2AX phosphorylation. 

Brd4 also has a defined role in transcriptional modulation, largely 
through interactions of isoform A with the pTEFb transcriptional 
complex’®"'. To investigate the contribution of Brd4-driven transcrip- 
tional changes to the suppression of DNA damage signalling, we pro- 
filed mRNA expression patterns of cells stably expressing control or 
Brd4 shRNAs. Only one DDR-associated transcript, CHEK2, showed 
a differential expression change of twofold or more (Supplementary 
Fig. 10a). Importantly, transient Brd4 knockdowns with siRNA, or 
short-term inhibition with JQ1, both of which increased γH2AX foci
formation after irradiation (Supplementary Fig. 5a and Fig. 2f), caused 
no change in CHEK2 mRNA levels (Supplementary Fig. 10b, c), and 
neither long-term nor short-term Brd4 knockdown affected the protein 
levels of several DDR molecules, including Chk2 (Supplementary 
Fig. 10d). Moreover, the suppression of DDR signalling by Brd4 isoform 
B overexpression was insensitive to transcription and translation inhibition
with α-amanitin and cycloheximide, respectively (Supplementary
Fig. 11).

As interactions between Brd4 and other protein complexes involved 
in modulating chromatin structure were probably responsible for the 
DDR effects we observed, we identified proteins co-immunoprecipitated 
with isoform B after DNA damage using mass spectrometry (Fig. 3a and 
Supplementary Fig. 12). From two independent experiments, we 
obtained a common set of 57 interacting proteins (Supplementary 
Tables 3 and 4). Because the DDR-relevant Brd4-binding proteins pre- 
sumably function in the same pathway as Brd4, we reasoned that loss of 
these proteins should show a phenotype similar to Brd4 loss-of-function. 
We therefore used our existing high-content screening data to create a 


types commonly treated with radiotherapy. No IR, no ionizing
radiation. f, Radiation survival effects of JQ1 in glioma cell lines measured at
72 h by CellTiterGlo (n = 3). g, Model for Brd4 effects on DNA damage
signalling.

[Figure 4g panel label: Condensed chromatin, ↓γH2AX signalling and survival]

list of the top quartile of genes ranked by increased γH2AX foci intensity,
number and size at 1 and 6h after irradiation (Fig. 3b). The overlap of 
this list with the list of isoform-B-interacting proteins showed two mem- 
bers of the condensin II complex, SMC2 and CAPD3 (Fig. 3c, d). This 
finding was intriguing as the condensin II complex has a known role in 
chromatin compaction in both mitotic and interphase cells, and has 
been linked to DNA damage repair’’. We performed immunoprecipita- 
tion experiments after DNA damage, and found that the SMC2 and 
SMC4 components of the condensin II complex co-immunoprecipitated 
with Brd4 isoform B, whereas Brd4 isoform A had minimal co-association 
(Fig. 3e). To verify the role of this interaction on the yYH2AX effects we 
observed, we performed combined isoform B and SMC2 knockdown and 
assayed H2AX phosphorylation 24h after siRNA transfection, when 
knockdown of each protein is sub-maximal. We found that H2AX phos- 
phorylation was enhanced with combined knockdown over knockdown 
of either protein alone (Fig. 3f, g). Furthermore, in cells overexpressing 
isoform B, SMC2 knockdown could abrogate the suppressive effects of 
Brd4 on yH2AX, demonstrating a functional interaction between isoform 
B and the condensin II complex in modulating yH2AX (Fig. 3h, j). Finally, 
we noted that the effects of isoform B on the DAPI staining pattern of 
chromatin were abrogated by co-transfection of SMC2 siRNA, indicating 
that the Brd4-condensin II interaction is involved in chromatin structure 
alterations (Fig. 3i). 
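
The candidate-selection step described above (rank genes by their screen score, keep the top quartile, intersect with the isoform B interactors) can be sketched as follows. This is a minimal illustration only: the scores, the placeholder gene names (`GENE_A` and so on) and the function name `top_quartile` are hypothetical and are not taken from the screening data.

```python
def top_quartile(scores):
    """Return the genes whose score lies in the top 25% of the ranking."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, len(ranked) // 4)
    return set(ranked[:n_top])


# Hypothetical per-gene scores (higher = larger increase in gamma-H2AX foci).
screen_scores = {
    "SMC2": 3.1, "CAPD3": 2.8, "GENE_A": 2.5, "GENE_B": 0.4,
    "GENE_C": 0.3, "GENE_D": 0.2, "GENE_E": 0.1, "GENE_F": -0.2,
    "GENE_G": -0.5, "GENE_H": -0.9, "GENE_I": -1.1, "GENE_J": -1.4,
}

# Hypothetical set of proteins recovered by co-immunoprecipitation.
interactors = {"SMC2", "CAPD3", "GENE_F", "GENE_X"}

hits = top_quartile(screen_scores)
candidates = sorted(hits & interactors)
print(candidates)
```

With these made-up inputs, only genes that are both strong screen hits and interactors survive the intersection, which is the logic used to prioritize SMC2 and CAPD3.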

We next investigated isoform B effects on other components of the 
DDR. We found that isoform B gain-of-function inhibited ionizing- 
radiation-induced foci formation of several other known DDR signal- 
ling components including 53BP1, phosphorylated ATM and several 
DDR signalling molecules containing the phospho-SQ DDR kinase 
substrate motif (Fig. 4a). In addition, overexpression of isoform B 
resulted in increased cell death after irradiation, an effect that was 
significantly diminished by mutation of BD1 (Fig. 4b). The cell death 
observed in Brd4 isoform B overexpressing cells seems to result from 
mitotic catastrophe, consistent with a loss of DDR signalling that 
results in failed cell-cycle arrest (Supplementary Fig. 13). We also 
investigated the effect of isoform B knockdown on DDR-induced 
cell-cycle arrest and survival. Interestingly, isoform B loss-of-function 
allowed increased cell survival with more rapid and efficient recovery
from cell-cycle arrest after irradiation, complementing the inverse 
findings observed with isoform B gain-of-function (Fig. 4c, d). 

Given the effects of Brd4 isoform B on ionizing-radiation-induced DDR signalling
and survival, we considered that isoform B might have a role in tumour responses
to irradiation. We screened a panel of established cell lines from several human
tumour types commonly treated with radiotherapy for γH2AX effects using the JQ1
inhibitor. Several cell types showed increased ionizing-radiation-induced H2AX
phosphorylation with JQ1 treatment, including breast, prostate and particularly
glioma cancer cell lines (Fig. 4e). Just as we had observed with U2OS cells,
irradiation had the expected killing effect on dimethylsulphoxide (DMSO)-treated
glioma cells; however, this killing effect was markedly reduced in JQ1-treated
glioma cells, consistent with our finding of increased DDR signalling and
radioresistance with decreased Brd4 function (Fig. 4f). Conversely,
overexpression of Brd4 isoform B in glioma cells inhibited H2AX phosphorylation,
consistent with decreased DDR signalling upon Brd4 gain-of-function
(Supplementary Fig. 14).

We conclude that structural alterations in chromatin mediated by Brd4
acetyl-lysine binding function to attenuate the DNA damage signalling response
to ionizing radiation. These effects on DDR signalling are consistent with the
induction of a chromatin structure that is inhibitory to the formation of γH2AX
in the case of higher levels of Brd4 isoform B expression, or a more ‘open’
chromatin structure that facilitates γH2AX foci formation when Brd4 expression
is reduced, or after pharmacological inhibition of bromodomain binding (shown
schematically in Fig. 4g).

Our data indicate that Brd4 affects DDR signalling through mechanisms distinct
from known transcriptional interactions with the P-TEFb transcriptional complex.
The relevant Brd4 isoform that modulates the DDR, isoform B, lacks the
P-TEFb-interacting region. In addition, chemical inhibition of
transcription/translation had no effect on the ability of Brd4 to suppress
DDR-induced γH2AX. This finding is in line with the recent identification of
other chromatin-interacting proteins, such as KAP-1 and Brg1, that have roles in
DNA damage signalling that do not seem to arise directly from the
transcriptional activity that these molecules also possess. Rather, the
enhancement of several parameters of γH2AX foci after Brd4 knockdown, including
their size and intensity, in addition to their number, points to a role for Brd4
in limiting the propagation of DDR signalling after ionizing radiation. This
effect seems to involve the recruitment of a chromatin-condensing complex to
sites of acetylation, a new role for Brd4. In agreement with this,
overexpression of Brd4, even in the absence of damage, resulted in alterations
of chromatin structure and nuclear acetylation patterns, consistent with a model
of Brd4 isoform B binding to and occluding acetyl-lysine sites on chromatin and
recruiting chromatin compaction machinery. These findings implicate
bromodomain-mediated interactions in modulating specific chromatin structures
that inhibit the propagation of DDR signalling in chromatin, and indicate that
Brd4 isoform B alters the threshold response of γH2AX to DNA damage.


METHODS SUMMARY 


Image-based high-content screening was performed in 384-well plate format using 
an arrayed lentiviral shRNA library from The RNAi Consortium. Screen images 
were acquired with a Cellomics microscope (Thermo Scientific) and quantified 
using CellProfiler software. siRNAs and antibodies were from commercial sources. 
We used Affymetrix U133 Plus 2.0 arrays for expression profiling. Mass
spectrometry data from Brd4 immunoprecipitates after SDS-PAGE were acquired with
an Orbitrap XL instrument (Thermo Scientific), and data were analysed with Mascot
software. Interactions for network analysis were hand-curated from primary lit- 
erature using the keywords ‘DNA damage’, ‘cell cycle checkpoint’, ‘chromatin 
structure’, ‘ATM/ATR’, ‘Chk1/Chk2’ and ‘SMC proteins’. Further details are pro- 
vided in the Methods. 

Full Methods and any associated references are available in the online version of
the paper.

METHODS


Antibodies and stains. Mouse monoclonal antibodies against γH2AX were from
Upstate/Millipore (catalogue number 05636), Actin (Sigma, catalogue number 
A5441), phospho-ATM Serine 1981 (Rockland, catalogue number 200-301-400), 
Flag (Sigma, catalogue number F3165), ornithine decarboxylase (Abcam, catalogue 
number ab66067), RAD50 (GeneTex, catalogue number GTX70228), NBS1 
(Abcam, catalogue number ab49958), MDC1 (Novus, catalogue number NB100- 
396) and Lamin (Millipore, catalogue number 05-714). Rabbit polyclonal and 
monoclonal antibodies against Brd4 were from Abcam (catalogue number 
Ab46199) and Pan-Brd4 from Sigma (catalogue number AV39076), 53BP1 
(Novus, catalogue number NB100-304), CHEK2 (Cell Signaling Technology, cata- 
logue number 2662), total H2AX (Abcam, catalogue number ab11175), phospho- 
SQ (Cell Signaling Technology, catalogue number 2851), MRE11 (Novus, catalogue 
number NB100-142), cleaved caspase 3 (Cell Signaling Technology, catalogue 
number 9664), SMC2 (Cell Signaling Technology, catalogue number 5329), 
SMC4 (Cell Signaling Technology, catalogue number 5547), phospho-histone H3
(Upstate/Millipore, catalogue number 06570 and BD/Pharmingen, catalogue number
559565). DNA stains were Hoechst 33342 (Invitrogen, catalogue number
H1399), propidium iodide (Invitrogen, catalogue number P1304MP) and ethidium
bromide (Invitrogen, catalogue number 15585011). Fluorescent antibodies were 
from Invitrogen: goat anti-rabbit and goat anti-mouse Alexa 488, 555 and 647 
(catalogue numbers A11001, A21422, A21235, A21238, A21428 and A21244). 

Small molecule inhibitors. Brd4 bromodomain inhibitor (+)JQ1 and its inactive
enantiomer (−)JQ1 were synthesized as described and were used at 250 nM.
α-Amanitin (catalogue number A2263) and cycloheximide (catalogue number C4859)
were from Sigma and were used at the concentrations indicated (α-amanitin
1-16 µM; cycloheximide 35-560 µM). UCN01 was from Sigma (catalogue number U6508)
and was used at concentrations of 0.003-10 µM. Caffeine was from Sigma
(catalogue number C0750) and was used at concentrations of 10-25 mM. LBH589 was a
gift from J. Bradner.

RNAi library. shRNA was applied to cells using a high-titre arrayed lentiviral
library maintained in the pLKO_TRC001 vector as described.

Image-based screens. For shRNA screens and small molecule tests, human U2OS
osteosarcoma cells (ATCC HTB-96) were grown in DMEM + Pen/Strep + 10% v/v FBS
(complete media) at 37 °C in a 5% CO2 atmosphere. All screens were performed at
passage 10-15. Cells were tested for mycoplasma by PCR before seeding and
infection. U2OS cells were seeded with a MicroFill (Biotek) in 384-well black,
clear-bottom plates (Greiner) at a density of 300 (shRNA) cells per well in 50 µl
of media, and allowed to attach overnight at 37 °C in a 5% CO2 atmosphere. For
shRNA screens, the media was exchanged the following day to complete media with
8 µg ml−1 polybrene using a JANUS workstation (PerkinElmer). Virus infection was
performed on an EP3 workstation (PerkinElmer) with 1.5 µl of high-titre
retrovirus. All plates had two wells infected with 1.5 µl of control virus with
shRNA directed against H2AX. Plates were centrifuged in a swinging-bucket rotor
at 1180g for 30 min after infection and returned to the incubator overnight. The
plates were then selected with 2.5 µg ml−1 puromycin for 48 h, and allowed to
proliferate in complete media for another 48 h, with media exchanges performed on
the JANUS or RapidPlate (Qiagen) liquid handling workstations. Eight wells in
each plate were not selected with puromycin. For small molecule testing, cells
were plated at 500 cells per well in 384-well plates. The day after plating,
small molecules at different concentrations in 100 nl DMSO were pin transferred
to cells with a CyBio robot, and cells were propagated for 16 h. For both small
molecule and shRNA screens, four plates were created in replicate for the time
points outlined below. Four wells were left untreated in each plate, and received
25 mM caffeine in complete media 1 h before irradiation. All plates were treated
with 10 Gy of 667 keV X-rays from a 137Cs source in a Gammacell irradiator
(Atomic Energy of Canada). A 0 h control plate was not irradiated. The plates
were returned to the incubator and fixed with 4.4% w/v paraformaldehyde in
phosphate-buffered saline (PBS) at 1, 6 and 24 h after irradiation. Plates were
stored in PBS at 4 °C before staining. Fixed plates were washed three times with
PBS and blocked with 24 µl of GSDB (0.15% goat serum, 8.33% goat serum, 120 mM
sodium phosphate, 225 mM NaCl) for 30 min. The 0, 1 and 6 h plates were incubated
with 1:300 dilutions in GSDB of primary mouse monoclonal anti-γH2AX (Ser 139),
and rabbit polyclonal anti-pHH3 antibody. For the 24 h plates, we substituted
1:300 rabbit polyclonal anti-cleaved caspase 3 for the pHH3 antibody. All plates
were incubated overnight at 4 °C, washed and stained with a secondary antibody
mix containing 10 µg ml−1 Hoechst 33342, 1:300 goat anti-mouse polyclonal-Alexa
Fluor 488 and goat anti-rabbit polyclonal-Alexa Fluor 555 in GSDB. After a second
overnight incubation at 4 °C, the plates were washed three times in PBS and
stored in 50 µl per well of 50 µM Trolox (Sigma) in PBS at 4 °C.

Imaging and image analysis. Plates were allowed to equilibrate to room
temperature for 30 min and imaged on a Cellomics ArrayScan VTI automated
microscope with a ×20 objective lens. The acquisition parameters were the same
for each shRNA or chemical inhibitor. Six fields per well were imaged, with three
channels per field (DAPI, fluorescein and rhodamine), for a total of 18 acquired
images per well. Images were segmented and analysed with CellProfiler cell image
analysis software. The imaging pipeline used to segment the images is available
on request. Cell morphology and intensity data were acquired on a per-image and
per-cell basis, and exported into a MySQL database. The data were visualized with
SpotFire (TIBCO) and CellProfiler Analyst.
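
As an illustration of how per-cell measurements of this kind might be condensed into one value per well, the sketch below groups hypothetical per-cell rows by well and takes the median of one feature. The column names (`well`, `foci_intensity`) and the choice of the median as the summary statistic are assumptions made for this example; they are not the pipeline or database schema used in the screen.

```python
from collections import defaultdict
from statistics import median

FIELDS_PER_WELL = 6      # fields imaged per well (from the text)
CHANNELS_PER_FIELD = 3   # DAPI, fluorescein, rhodamine (from the text)
IMAGES_PER_WELL = FIELDS_PER_WELL * CHANNELS_PER_FIELD


def per_well_median(cell_rows, feature):
    """Collapse per-cell rows into a {well: median(feature)} mapping."""
    by_well = defaultdict(list)
    for row in cell_rows:
        by_well[row["well"]].append(row[feature])
    return {well: median(values) for well, values in by_well.items()}


# Hypothetical per-cell rows, as they might be read from a database table.
cells = [
    {"well": "A01", "foci_intensity": 1.0},
    {"well": "A01", "foci_intensity": 3.0},
    {"well": "A01", "foci_intensity": 2.0},
    {"well": "A02", "foci_intensity": 5.0},
    {"well": "A02", "foci_intensity": 7.0},
]

summary = per_well_median(cells, "foci_intensity")
print(IMAGES_PER_WELL, summary)
```

A median is used here only because it is robust to a few mis-segmented cells; any other per-well summary could be substituted.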

Immunofluorescence microscopy. U2OS cells were plated on number 1 glass
coverslips (VWR) and were cultured in DMEM + Pen/Strep + 10% v/v FBS (complete
media) at 37 °C in a 5% CO2 atmosphere, then exposed to 10 Gy ionizing radiation
from a 137Cs source in a Gammacell irradiator (Atomic Energy of Canada), fixed in
methanol and processed for immunofluorescence using the antibodies indicated
above. Images were captured on a Zeiss Axiophot II microscope with a Hamamatsu
CCD (charge-coupled device) camera and processed with OpenLab/Volocity software.
We used CellProfiler (www.CellProfiler.org) or ImageJ software
(http://rsb.info.nih.gov/nihimage]) for quantitative image analysis.

RT-PCR. Total RNA was extracted from 10° U2OS cells expressing either control or
Brd4-directed shRNA, with an RNeasy kit (Qiagen). Complementary DNA was generated
with oligo(dT) primers and SuperScript reverse transcriptase (Invitrogen)
according to the manufacturer’s instructions. These complementary DNAs were used
as templates for linear-range PCR amplification or quantitative real-time PCR
with SYBR green master mix on an Applied Biosystems 7500 with the following
primers: forward 5'-CTC CTC CTA AAA AGA CGA AGA-3' and reverse
(pan-Brd4 isoform) 5'-TTC GGA GTC TTC GCT GTC AGA GGA G-3',
(Brd4 isoform A) 5'-GCC CCT TCT TTT TTG ACT TCG GAG C-3',
(Brd4 isoform B) 5'-GCC CTG GGG ACA CGA AGT CTC CAC T-3',
(Brd4 isoform C) 5'-CCG TTT TAT TAA GAG TCC GTG TCC A-3';
(CHEK2) forward 5'-ACAGATAAATACCGAACATACAGC-3' and reverse
5'-GACGGCGTTTTCCTTTCCCTACAA-3'; and (GAPDH) primers forward
5'-GATGCCCTGGAGGAAGTGCT-3' and reverse 5'-AGCAGGCACAACACCACGTT-3' as control
for normalization.
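
Normalization of a target transcript to GAPDH in quantitative real-time PCR is commonly done with the 2^(−ΔΔCt) calculation. The sketch below shows that arithmetic under the assumption of roughly 100% amplification efficiency; the text does not state which quantification method was used, and the Ct values and the function name `relative_expression` are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus control by the 2^-ddCt method.

    Each Ct is a threshold cycle; the reference gene (here GAPDH) corrects
    for differences in input cDNA between the two samples.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold two cycles later in
# knockdown cells than in control cells, with identical GAPDH Ct values.
fold = relative_expression(
    ct_target_sample=26.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold)
```

In this made-up case the two-cycle shift corresponds to a fourfold reduction in transcript relative to control.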

Expression profiling and analysis. Total RNA was collected from stable U2OS
cells expressing Brd4 or control shRNA using RNeasy (Qiagen), labelled and
analysed on the Affymetrix U133 Plus 2.0 array. Unsupervised clustering of
expression data was performed using the R package pvclust. LIMMA (ref. 21) was
used to identify important changes in expression between Brd4 knockdown and
control cells. Data were deposited in the US National Institutes of Health Gene
Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE30700).

Subcellular fractionation. U2OS cells expressing Flag-tagged Brd4 isoforms were
lysed in hypotonic conditions (10 mM Hepes, 10 mM NaCl, 25 mM KCl, 1 mM MgCl2,
0.1 mM EDTA, pH 7.4 with protease inhibitors) and subjected to flash freezing in
liquid nitrogen 1 h after mock treatment or exposure to 10 Gy of ionizing
radiation with a 137Cs source in a Gammacell irradiator (Atomic Energy of
Canada). Cells were thawed at room temperature and spun down at 10,000g for
10 min. The supernatant was saved as the cytoplasmic fraction, concentrated
using trichloroacetic acid precipitation and reconstituted in 2× Laemmli
buffer. The pellet was re-suspended in high salt buffer (20 mM Hepes, 0.5 mM DTT,
1.5 mM MgCl2, 0.1% Triton X-100, 1 M NaCl, pH 7.4 with protease inhibitors) and
left on ice for 30 min followed by a high-speed spin at 100,000g for 30 min. The
supernatant was saved as the high salt fraction, concentrated using
trichloroacetic acid precipitation and reconstituted in 2× Laemmli buffer.
Sulphuric acid (0.4 N) was added to the high-speed pellet and left on ice for
30 min, followed by a high-speed spin at 14,000g for 10 min. The supernatant was
saved as the acid fraction, concentrated using trichloroacetic acid
precipitation and reconstituted in 2× Laemmli buffer.

Western blotting and immunoprecipitation. Cells were treated with 10 Gy
ionizing radiation with a 137Cs source in a Gammacell irradiator (Atomic Energy
of Canada). For whole cell lysates, cells were trypsinized and lysed in LB (4%
SDS, 120 mM Tris, pH 6.8) with protease and phosphatase inhibitors (Complete mini
EDTA-free and PhosSTOP, Roche Applied Science). For chromatin isolation, cells
were trypsinized, re-suspended in low salt buffer (LSB: 10 mM Hepes, 10 mM NaCl,
25 mM KCl, 1.0 mM MgCl2, 0.1 mM EDTA, pH 7.4 + protease inhibitors, as above),
flash-frozen in liquid N2, thawed, pelleted at 10,000g for 10 min, re-suspended
in high salt buffer (HSB: 20 mM Hepes, 1.0 M NaCl, 0.5 mM DTT, 1.5 mM MgCl2, 0.1%
Triton X-100 + protease inhibitors) for 45 min on ice and pelleted at 100,000g
for 30 min, and proteins from the supernatant were precipitated with
trichloroacetic acid. For immunoprecipitation, U2OS cells expressing Flag-tagged
Brd4 isoforms were lysed in low salt buffer (50 mM Tris HCl, pH 7.4, 150 mM NaCl,
1 mM EDTA, 0.5% NP-40 with protease inhibitors) and subjected to flash freezing
in liquid nitrogen 1 h after mock treatment or irradiation. Cells were thawed at
room temperature and spun down at 10,000g for 10 min. The supernatant was removed
and saved as the pre-immunoprecipitation cytoplasmic fraction. The nuclear pellet
was re-suspended in low salt buffer, tip sonicated at 4 °C (35% amplitude, pulse
5 s on and off for three cycles), and spun down at 14,000g for 10 min. The
supernatant was collected as starting material for immunoprecipitation using M2
Flag beads (Sigma Aldrich) overnight at 4 °C. The beads were then spun down and
the first supernatant saved as the unbound fraction. The beads were washed five
times with low salt buffer and proteins were solubilized in 2× Laemmli buffer and
boiled at 95 °C for 3 min before loading onto SDS-PAGE. Samples were processed
after SDS-PAGE for gel band cutting and in-gel tryptic digestion for mass
spectrometry or western blotting to detect pulldown of the condensin II complex
(SMC2 and SMC4 proteins) with Brd4 isoforms. SDS-PAGE and western blotting were
performed according to the methods of Laemmli and Towbin using either a Li-cor
Odyssey scanner or horseradish-peroxidase-coupled secondary antibodies (Bio-Rad)
and Western Lightning enhanced chemiluminescence (PerkinElmer) for visualization
of bands.

Pulsed-field gel electrophoresis and micrococcal nuclease assay. For
pulsed-field gel analysis, control and BRD4 knockdown cells were plated at
1 × 10° cells per plate, exposed to 10 Gy ionizing radiation with a 137Cs source
in a Gammacell irradiator (Atomic Energy of Canada) and collected at 0.5, 1, 2, 3
and 5 h. Cells were trypsinized, diluted to 2 × 10° cells and embedded in agarose
plugs. The agarose plugs were exposed to Proteinase K (1 mg ml−1) in 500 mM EDTA,
1% N-lauryl Sarcosyl, pH 8.0, for 48 h, washed 3 × 1 h with TE buffer, loaded
onto a 0.675% agarose gel and separated under pulsed-field conditions with a
Rotaphor 6.0 (Biometra). Nuclei from control and Brd4 knockdown cells were
isolated by hypotonic lysis and micrococcal nuclease assays performed as
described by Carey and Smale.

Flow cytometry. U2OS cells were plated and transiently transfected with GFP
transgenes or siRNA as indicated, exposed to varying doses of ionizing radiation
from a 137Cs source in a Gammacell irradiator (Atomic Energy of Canada) and
collected at varying times as indicated by fixation with 4% formaldehyde
(cell-death measurements) or directly extracted with 100% ethanol (cell-cycle
measurements), and processed for flow cytometry using the antibodies listed
above. Data were analysed using FlowJo (www.flowjo.com) software.

Colony formation assays. Control and BRD4 knockdown cells were exposed to 
the indicated doses of ionizing radiation from a 137Cs source in a Gammacell
irradiator (Atomic Energy of Canada), or left untreated, trypsinized, counted 
and re-plated using serial dilutions. Colonies were propagated to the 10- to 15- 
cell stage (3-7 days), stained with Wright stain (Sigma) and counted with 
CellProfiler software or by averaging counts of ten fields from three independent 
observers using a dissection microscope to identify colonies of more than 15 cells. 

Constructs, shRNA and siRNA, and transfection. Full-length constructs of Brd4
isoform A (accession number NM_058243), B (accession number BC035266) and C
(accession number NM_014299.2) were cloned into pEGFP-C1 (Clontech) and
pFLAG-CMV2 (Sigma) by PCR. Bromodomain mutations were introduced using
QuikChange (Stratagene) with PCR primers
5'-AAA TTG TTA CAT CGC CAA CAA GCC TGG AGA TGA CGC AGT CTT AAT GGC AG-3' and
5'-CTG CCA TTA AGA CTG CGT CAT CTC CAG GCT TGT TGG CGA TGT AAC AAT TT-3'.
Cells were transfected with Fugene 6 (Roche) according to the manufacturer's
instructions. shRNAs directed against Brd4 were from the TRC library (see
Supplementary Table 1), or created in the mir30-based pMLP vector (a gift from
M. Hemann) with primer 5'-TGC TGT TGA CAG TGA GCG AAG ACA CA-3' for Brd4. U2OS
cell lines stably expressing this shRNA or control hairpins (ineffective
hairpins directed against human sequences of BAD and PUMA) were created using
puromycin selection at 2 µg ml−1. STEALTH siRNA against pan-isoform BRD4, SMC2
and control were purchased from Invitrogen. Custom Brd4 isoform-specific siRNAs
were synthesized by Dharmacon using the following sequences: isoform A specific
5'-GGG AGA AAG AGG AGC GUG AUU-3' and isoform B specific
5'-GCA CCA GUG GAG ACU UCG UUU-3'. siRNA against SMC2 was from Dharmacon. For
siRNA experiments, cells were transfected with Lipofectamine RNAiMax
(Invitrogen) according to the manufacturer’s instructions.
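
Site-directed mutagenesis of this kind typically uses a pair of fully complementary primers. As a quick sanity check, the sketch below confirms that the two bromodomain mutagenesis primers listed above are exact reverse complements of each other. The variable names `primer_1` and `primer_2` and the helper `reverse_complement` are labels introduced for this example only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# The two mutagenesis primers as printed in the Methods (5' to 3').
primer_1 = "AAA TTG TTA CAT CGC CAA CAA GCC TGG AGA TGA CGC AGT CTT AAT GGC AG"
primer_2 = "CTG CCA TTA AGA CTG CGT CAT CTC CAG GCT TGT TGG CGA TGT AAC AAT TT"

is_complementary_pair = reverse_complement(primer_1) == primer_2.replace(" ", "")
print(is_complementary_pair)
```

The same helper can be used to spot transcription errors in any complementary primer pair, because a single wrong base breaks the equality.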

Mass spectrometry. Proteins from the Brd4 co-immunoprecipitation were examined
after SDS-PAGE by staining with Coomassie blue. Gel bands were excised,
de-stained and processed for digestion with trypsin (Promega; 12.5 ng µl−1 in
50 mM ammonium bicarbonate, pH 8.9). Peptides were loaded directly onto a column
packed with C18 beads. The column was placed in-line with a tapered electrospray
column packed with C18 beads on an Orbitrap XL mass spectrometer (Thermo
Scientific). Peptides were eluted using a 120-min gradient (0-70% acetonitrile in
0.2 M acetic acid; 50 nl min−1). Data were collected using the mass spectrometer
in data-dependent acquisition mode to collect tandem mass spectra and examined
using Mascot software (Matrix Science).

Network analysis. Protein-protein and kinase-substrate interactions relevant to
DNA damage signalling were hand curated from primary literature available in
PubMed using the initial keywords ‘DNA damage’, ‘cell cycle checkpoint’,
‘chromatin structure’, ‘ATM/ATR’, ‘Chk1/Chk2’ and ‘SMC proteins’, and following
reference lists.


21. Smyth, G. K. Linear models and empirical bayes methods for assessing 
Chromosome-specific nonrandom sister chromatid
segregation during stem-cell division 


Swathi Yadlapalli & Yukiko M. Yamashita


Adult stem cells undergo asymmetric cell division to self-renew and 
give rise to differentiated cells that comprise mature tissue’. Sister 
chromatids may be distinguished and segregated nonrandomly in 
asymmetrically dividing stem cells’, although the underlying mech- 
anism and the purpose it may serve remain elusive. Here we develop 
the CO-FISH (chromosome orientation fluorescence in situ hybrid- 
ization) technique’ with single-chromosome resolution and show 
that sister chromatids of X and Y chromosomes, but not autosomes, 
are segregated nonrandomly during asymmetric divisions of Drosophila 
male germline stem cells. This provides the first direct evidence, to 
our knowledge, that two sister chromatids containing identical genetic 
information can be distinguished and segregated nonrandomly 
during asymmetric stem-cell divisions. We further show that the 
centrosome, SUN-KASH nuclear envelope proteins and Dnmt2 
(also known as Mt2) are required for nonrandom sister chromatid 
segregation. Our data indicate that the information on X and Y chro- 
mosomes that enables nonrandom segregation is primed during 
gametogenesis in the parents. Moreover, we show that sister chro- 
matid segregation is randomized in germline stem cell overproli- 
feration and dedifferentiated germline stem cells. We propose that 
nonrandom sister chromatid segregation may serve to transmit dis- 
tinct information carried on two sister chromatids to the daughters 
of asymmetrically dividing stem cells. 

The Drosophila male germline stem cell (GSC) system is an excellent 
model system for the study of asymmetric stem cell division. GSCs can 
be identified at single-cell resolution at the apical tip of the testis, where 
they attach to a cluster of somatic hub cells, a major component of the 
stem-cell niche*. GSCs divide asymmetrically by orienting the mitotic 
spindle perpendicular to the hub’. We showed previously that the 
mother centrosome is inherited by the GSCs°. 

We adapted the CO-FISH (chromosome orientation fluorescence 
in situ hybridization) protocol, which allows strand-specific identifica- 
tion of sister chromatids’, combined with chromosome-specific probes’ 
(Fig. 1a). Using this method, we identified the sister chromatids of each 
chromosome in GSCs and their differentiating daughter gonialblasts 
(Fig. 1b and Supplementary Fig. 1). We found that sister chromatids of 
the Y chromosome are inherited with a strong bias during GSC division: in
approximately 85% of cases, GSCs inherited the sister chromatid of the Y
chromosome whose template strand contains the (GTATT)6 satellite (and thus
hybridizes to the Cy3-(AATAC)6 probe), and gonialblasts inherited the sister
chromatid whose template contains the (AATAC)6 sequence (and thus hybridizes to
the Cy5-(GTATT)6 probe; Fig. 1c, d). Using X-chromosome-specific probes, we found
that the X chromosome shows a similar bias (Fig. 1e, f). Essentially the same
results were obtained
when the Cy5 probe for the X chromosome was replaced with a probe 
that is not complementary to the Cy3-labelled probe (Supplementary 
Fig. 2). Although both X and Y chromosomes show a similar bias in 
segregation (approximately 85:15), we found that the two chromosomes 
segregate independently of each other (Fig. 1g-i) (see Methods for details). 
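
To make the independence claim concrete, the sketch below computes the joint frequencies expected if the X and Y chromosomes each segregate with an 85:15 bias but independently of one another. The value 0.85 is taken from the approximate bias quoted above; the labels `favoured` and `other` and the function name are illustrative assumptions, and the observed counts in Fig. 1g-i are not reproduced here.

```python
def joint_pattern_frequencies(p_x, p_y):
    """Expected joint frequencies for two independently segregating chromosomes.

    p_x and p_y are the probabilities that the GSC inherits the favoured
    sister chromatid of the X and of the Y chromosome, respectively.
    Keys are (X outcome, Y outcome).
    """
    return {
        ("favoured", "favoured"): p_x * p_y,
        ("favoured", "other"): p_x * (1 - p_y),
        ("other", "favoured"): (1 - p_x) * p_y,
        ("other", "other"): (1 - p_x) * (1 - p_y),
    }


expected = joint_pattern_frequencies(0.85, 0.85)
for pattern, freq in expected.items():
    print(pattern, round(freq, 4))
```

Under strict co-segregation the two mixed patterns would be absent, so a substantial fraction of mixed outcomes (about a quarter in total under independence) is what separates the two models.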

Two major scenarios can explain the observed bias of approximately 85:15. In
the first scenario, approximately 85% of GSCs inherit the ‘red strand’ (that is,
the sister chromatid containing the template strand that hybridizes to Cy3
probes) with near 100% accuracy, whereas approximately 15% of GSCs inherit the
‘blue strand’ with near 100% accuracy. This would indicate that GSCs maintain
particular strands of the X and Y chromosomes forever (‘immortal strands’). In
the second scenario, each GSC inherits the ‘red strand’ with 85% probability and
the ‘blue strand’ with 15% probability at each division. In this case, GSCs do
not retain immortal strands; instead, the ‘template strands’ switch
approximately once in every seven divisions (15% ≈ 1/6.7). To distinguish
between these possibilities, we conducted a long-pulse experiment where flies
were continuously exposed to 5-bromodeoxyuridine-containing medium (see
Supplementary Fig. 3 for details). The results of this experiment clearly
supported the second scenario.
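
The arithmetic behind the two scenarios can be sketched as follows. Both give the same 85:15 split when a single division is scored, but they differ in how often the same GSC makes the same choice in two consecutive divisions. This is only a toy calculation under the stated probabilities; it does not model the actual read-out of the long-pulse experiment, and the function names are hypothetical.

```python
def mean_divisions_between_switches(p_minor):
    """Mean number of divisions per 'switch' when each division
    independently yields the minor outcome with probability p_minor."""
    return 1.0 / p_minor


def same_choice_twice(p_red, immortal):
    """Probability that one GSC inherits the same strand class in two
    consecutive divisions.

    immortal=True  -> scenario 1: each GSC always keeps its own strand.
    immortal=False -> scenario 2: an independent draw at every division.
    """
    if immortal:
        return 1.0
    return p_red ** 2 + (1 - p_red) ** 2


print(round(mean_divisions_between_switches(0.15), 1))
print(same_choice_twice(0.85, immortal=True))
print(round(same_choice_twice(0.85, immortal=False), 3))
```

Because the single-division snapshot cannot separate the scenarios, some measurement spanning more than one division is needed, which is the rationale for a long-pulse design.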

In contrast to X and Y chromosomes, we found that the autosomes 
(chromosomes 2 and 3) do not show biased segregation (~50:50; Fig. 2). 
Consistent with previous reports that homologous chromosomes are 
paired, even in non-meiotic cells in Drosophila’, we observed that two 
autosome signals corresponding to homologous chromosomes were 
always juxtaposed to each other (Fig. 2a–d). In spite of the lack of biased
segregation with regard to which strands are inherited by GSCs, cells 
always inherited two Cy3 signals or two Cy5 signals, the mechanism 
and significance of which remain elusive. It should be noted that the 
repeat sequences used as probes for chromosome 2 and 3 also exist on 
the Y chromosome’, yielding a third ‘lone’ signal in addition to the paired 
autosome signals. The identity of the lone signal was confirmed by com- 
bining autosome probes and a Y chromosome probe, 488-(AATAC)6. 
The Y chromosome signal was always close to the lone signal (Fig. 2e, f). 
Importantly, the Y chromosome detected as a lone signal showed biased 
segregation, despite the fact that the paired autosome signals showed a 
random segregation pattern in the same set of samples (Fig. 2g). This 
result further confirms our observation that sister chromatids of the Y 
chromosome are segregated nonrandomly. 

Although many studies have reported biased sister chromatid segregation, 
the genes responsible for biased segregation have never been described. 
We found that centrosomin (cnn), a core component of the pericentriolar
material, the SUN domain protein KOI and the KASH domain protein KLAR are
required for biased sister chromatid segregation (Fig. 3, Supplementary
Table 1). It is well established that the LINC (linker of nucleoskeleton and
cytoskeleton) complex, composed of SUN- and KASH-domain proteins, tethers the
nucleus to cytoskeletal components (such as microtubules, which in turn connect
to the centrosome) via the nuclear envelope.
Thus, we speculate that specific sister chromatids are tethered to the 
mother centrosome of the GSC that is consistently located at the hub- 
GSC junction (see Fig. 4e). 

We further found that sister chromatid segregation of X and Y chro- 
mosomes was randomized in dnmt2 mutants (Supplementary Table 2a 
and Supplementary Fig. 4). Although some studies indicated that DNMT2 
has DNA methyltransferase activity'*’’, other studies showed that it 
functions as an RNA methyltransferase’® and that DNA methylation is 
barely detectable in the Drosophila genome’’. Therefore, the mech- 
anism by which DNMT2 participates in nonrandom sister chromatid 
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Figure 1 | Nonrandom segregation of Y and X 
chromosome strands during GSC divisions. 

a, Chromosome-specific probes used in this study. 
b, Schematic diagram of the CO-FISH procedure. 
Cy3- and Cy5-labelled probes for the Y 
chromosome are shown as an example. Green 
fluorescent protein-labelled PAVAROTTI (PAV- 
GFP)” (midbody/ring canal), SH-ADD-Venus”* 
or anti-ADD antibody (spectrosome) was used to 
identify GSC-gonialblast pairs. c—i, Representative 
images of CO-FISH results using Y chromosome 
probes (c, d), X chromosome probes (e, f), and both 
X and Y probes (g-i). Expected segregation 
patterns based on co-segregation versus random 
segregation are shown at the bottom of g, h andi. In 
all figures the Cy5 signal is indicated by solid 
arrowheads and the Cy3 signal by open 
arrowheads. An asterisk marks the position of the 
hub. N, number of GSC-gonialblast pairs scored. 
Data are presented as mean + standard deviation. 


Figure 2 | Autosomes are randomly 
segregated during GSC divisions. 
a-d, Representative images of CO- 
FISH results using chromosome 2 
probes (a, b), and chromosome 3 
probes (c, d). Lone signals that 
correspond to the Y chromosome 
are marked with ‘Y’. N, number of 
GSCs scored. An asterisk marks 
the position of the hub. e, f, A 
representative image showing that 
the lone signal of the (AACAC). 
probe (open arrowheads) is close to 
the (AATAC), signal (blue 
arrowhead). g, Summary of scoring 
results using chromosome 2 probes. 


(N = 26) 
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aa 
Figure 3 | cnn, koi and klar are required for nonrandom sister chromatid 
segregation. a, b, Representative images of Y chromosome CO-FISH in cnn 
mutant. Open arrowheads indicate the Cy3-(AATAC)n probe; closed 
arrowheads indicate the Cy5-(GTATT)n probe; asterisk indicates the hub. 

c, d, Representative images of X chromosome CO-FISH in koi mutant. Open 
arrowheads indicate the Cy3-X probe; closed arrowheads indicate the Cy5-X 
probe; asterisk indicates the hub. 


segregation remains elusive. However, our analysis, using various cross- 
ing schemes (crosses of homozygous mother/father with heterozygous 
father/mother), indicates that DNMT2 confers heritable, DNA sequence- 
independent information on the X and Y chromosomes during game- 
togenesis in the parents, leading to nonrandom sister chromatid segregation 
of X and Y chromosomes in the GSCs of the progeny (Supplementary 
Table 2b). For example, in GSCs from flies that are genetically hetero- 
zygous (dnmt2+/−), where the X chromosome is inherited from a mutant 
mother (dnmt2−/−) and the Y chromosome from a heterozygous father 
(dnmt2+/−), X chromosome segregation was randomized, whereas Y 
chromosome segregation remained nonrandom. These results suggest 
the striking possibility that the information that enables nonrandom 
sister chromatid segregation of X and Y chromosomes in adult stem 
cells is primed during gametogenesis in the parents, transmitted to the 
zygote on single X and Y chromosomes, and maintained through many 
cell divisions during embryogenesis and adult tissue homeostasis. 

We found that sister chromatid segregation of X and Y chromo- 
somes is randomized in GSC overproliferation induced by ectopic 
expression of UPD (also known as OS; Fig. 4a, b and Supplementary 
Table 3). UPD is a signalling ligand that is normally expressed exclu- 
sively in hub cells and activates the JAK-STAT pathway in GSCs and 
cyst stem cells to specify stem cell identity*. This finding indicates that 
nonrandom sister chromatid segregation is under the control of stem 
cell identity. However, it is unlikely that nonrandom sister chromatid 
segregation determines GSC identity, because the mutants defective in 
nonrandom segregation described above (cnn, koi, klar, dnmt2) do not 
show GSC overproliferation or depletion. 
Figure 4 | Nonrandom segregation of Y and X chromosomes is disrupted 
in upd-overexpressing testes and dedifferentiated stem cells. a, b, 
Representative images of CO-FISH using the Y probe upon overexpression of 
UPD (nos-gal4>UAS-UPD). For this experiment we limited our analysis to 
GSCs juxtaposed to hub cells, because GSCs located away from the hub do not 
have a spatial reference point for assessment of the sister chromatid segregation 
pattern. N, number of GSC-gonialblast pairs scored. An asterisk marks the 
position of the hub. c, d, Representative images of CO-FISH using the Y probe 
in dedifferentiated GSCs. Differentiation was induced by heat-shock treatment 
of hs-Bam flies followed by a 5-day recovery period”. e, Model of nonrandom 
sister chromatid segregation (see text for details). 


We also found that sister chromatid segregation is randomized in 
dedifferentiated GSCs (Fig. 4c, d and Supplementary Table 3). Partially 
differentiated germ cells can revert back to GSC identity to replenish the 
stem-cell pool'*’. Although these dedifferentiated GSCs are apparently 
functional because they can produce differentiating spermatogonia 
and reconstitute spermatogenesis'*”°, they did not recover nonrandom 
sister chromatid segregation. This result may indicate that the informa- 
tion on X and Y chromosomes that allows nonrandom sister chromatid 
segregation is lost upon commitment to differentiation as a gonialblast. 
Consistent with our earlier observation that dedifferentiation increases 
during ageing”, we found that nonrandom sister chromatid segregation 
was compromised during ageing (at day 30, 63:37 for the X chromo- 
some (N = 35) and 68:32 for the Y chromosome (N = 28)). 

This study provides the first evidence that adult stem cells can distin- 
guish two sister chromatids, and further points to a model in which sister 
chromatids are distinctly recognized, leading to anchorage of particular 
strands to the mother centrosome through the SUN-KASH proteins 
(Fig. 4e). Our data also indicate that nonrandom sister chromatid segre- 
gation does not necessarily mean that they are immortal”’. 

At present it is not clear why X and Y chromosomes segregate non- 
randomly. Considering the data presented in this study, we favour the 
possibility that certain epigenetic information is transmitted distinc- 
tively to GSCs and gonialblasts. Indeed, X and Y chromosomes are subject 
to various forms of epigenetic regulation, such as dosage compensation” 
and male-specific meiotic sex chromosome inactivation”. In addition, 
Stellate, a repetitive sequence that encodes a polypeptide known to reduce 
fertility, and Suppressor of Stellate (Su(Ste)), the Piwi-interacting RNA 
(piRNA) that suppresses Stellate expression, are located on the X and Y 
chromosomes, respectively”. Intriguingly, we observed that Stellate 
is derepressed in mutants of cnn, dnmt2, koi and klar (Supplementary 
Fig. 5), although determination of whether derepression of Stellate is 
due to a failure in nonrandom sister chromatid segregation awaits 
future investigation. Not surprisingly, we found that the mutants in 
which Stellate is derepressed show reduced fertility (Supplementary 
Fig. 6). 

Recently, it was shown that old versus new histones segregate asym- 
metrically during GSC divisions”®. Our study demonstrates that GSCs 
do not segregate old (immortal) DNA strands. Thus, the relationship 
between biased sister chromatid segregation and histone segregation 
remains elusive. In summary, our study presents the first evidence of 
chromosome-specific nonrandom sister chromatid segregation in adult 
stem cells and provides mechanistic insights into how cells segregate 
sister chromatids nonrandomly. 


METHODS SUMMARY 


For CO-FISH combined with immunofluorescence staining, newly eclosed flies 
(unless otherwise noted) were fed with 5-bromodeoxyuridine for ~10 h, followed 
by a period in non-5-bromodeoxyuridine medium (~10h). The testes were then 
immunostained as described previously”. Subsequently, testes were irradiated with 
ultraviolet light, followed by treatment with exonuclease III. Then, CO-FISH 
probes were hybridized to detect template strands. 


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS 

Fly husbandry. All fly stocks were raised on Bloomington Standard Media at 25 °C 
unless otherwise noted. The following fly stocks were used: Ubi-Pavarotti-GFP, 
SH-adducin-Venus, cnn" /CyO, enn™?/CyO, koitRKO80- DF(2R)Exel6050/ 
CyO, Klar', Df(3L)emc-E12, P(EP)Mt2°*” (denoted dnmt2?* in the text), dnmt2™, 
dnmt2'®, Df(2L)ED775/CyO, hs-Bam, UAS-UPD/CyO, and nos-gal4. These stocks 
are described in FlyBase. 

Combined immunofluorescence staining and CO-FISH. Newly eclosed adult 
flies (day 0) were fed food containing 5-bromodeoxyuridine (950 µl 100% apple 
juice, 7 µg agar, and 50 µl of 100 mg ml−1 5-bromodeoxyuridine solution in a 1:1 
mixture of acetone and DMSO) for approximately 10 h. After the feeding period, 
flies were transferred to regular fly food for approximately 10 h. Because the 
average GSC cell cycle length is 12 h, most GSCs undergo a single S phase followed 
by mitosis during our feeding procedure. GSCs that have undergone more or less 
than one S phase or mitosis were excluded from our analysis by limiting scoring to 
GSC-gonialblast pairs that have complementary CO-FISH signals in the GSC and 
gonialblast (that is, red signal in one cell, blue signal in the other). All possible 
scenarios are explained in Supplementary Fig. 1. Samples were dissected in 1X 
PBS, fixed for 30-60 min with 4% formaldehyde in PBS, permeabilized for at least 
1h in PBST (0.1% Triton X-100 in PBS) and incubated with primary antibodies 
overnight at 4°C. Samples were then washed with PBST (20 min, three times), 
incubated overnight at 4°C with Alexa Fluor-conjugated secondary antibodies 
(1:200; Molecular Probes), and washed again with PBST (20 min, three times). 
Samples were fixed for 10 min with 4% formaldehyde followed by three washes in 
PBST for 5 min each. Samples were then treated with RNase A (2 mg ml−1 in 
water) for 10 min at 37 °C, washed with PBST for 5 min, and stained with 100 µl 
Hoechst 33258 (Sigma Aldrich) at 2 µg ml−1 for 15 min at room temperature. The 
samples were then rinsed with 2X SSC, transferred to a tray, and irradiated with 
ultraviolet light in a UV Stratalinker 1800 (calculated dose, 5,400 J m−2). Nicked 
5-bromodeoxyuridine strands were digested with exonuclease III (New England 
Biolabs) at 3 U µl−1 in buffer supplied by the manufacturer (50 mM Tris-HCl, 
5 mM MgCl2 and 5 mM dithiothreitol (DTT), pH 8.0) at 37 °C for 10 min. Samples 
were rinsed once with PBST for 5 min and then fixed in 4% formaldehyde in PBS 
for 2 min and washed three times for 5 min each in PBST. To allow gradual 
transition into 50% formamide/2X SSC, samples were incubated sequentially 
for a minimum of 10 min each in 20% formamide/2X SSC, 40% formamide/2X 
SSC, and 50% formamide/2X SSC. The hybridization mixture consisted of 50% 
formamide, 2X SSC, 10% dextran sulphate, 0.5 µg ml−1 Cy3-labelled probe, and 
0.5 µg ml−1 Cy5-labelled probe. Fluorescence-labelled probes were obtained from 
Integrated DNA Technologies. The hybridization solution was added to the sam- 
ples and hybridization was carried out at 37 °C overnight. Using non-complement- 
ary pairs of probes for the X chromosome, we detected a similar bias in segregation 
pattern (Supplementary Fig. 2), excluding the possibility that annealing of com- 
plementary probes interferes with correct hybridization between the probes and 
the target sequences. Autosome probes were denatured in hybridization solution at 
65 °C for 3 min before hybridization. The samples were never heat-denatured. As a 
critical control, hub cells, which are predominantly quiescent and, thus, do not 
incorporate 5-bromodeoxyuridine, did not show any CO-FISH signal (evident in 
all images). 

Following hybridization, samples were washed once in 50% formamide/2X SSC, 
once in 25% formamide/2X SSC, and three times in 2X SSC. Samples were then 
mounted in VECTASHIELD (H-1200, Vector Laboratories) and images were 
recorded using a Leica TCS SP5 confocal microscope with a 63× oil immersion 
objective (numerical aperture = 1.4) and processed using Adobe Photoshop software. 
The primary antibodies used were rabbit anti-Vasa (1:200; Santa Cruz Biotechnology), 
mouse anti-Adducin-like (1:20; developed by H. D. Lipshitz and obtained from the 
Developmental Studies Hybridoma Bank (DSHB)), mouse anti-Armadillo (1:20; 
developed by Eric Wieschaus and obtained from DSHB), rabbit anti-Stellate 
(1:1,000, a gift of P. Zamore”’). The secondary antibodies used were Alexa Fluor 
594- and 488-conjugated secondary antibodies (1:200; Molecular Probes). 
CO-FISH with both X and Y probes. The X and Y probes were labelled such that 
GSCs retain the Cy3 signal in ~85% of cases. If segregation of X and Y chromo- 
somes is correlated, the probability that a GSC inherits two Cy3 signals will be 
approximately 85%, and that of inheriting two Cy5 signals will be approximately 
15%, whereas there will be few instances where a GSC inherits one Cy3 and one 
Cy5 signal. In contrast, if the X and Y chromosomes segregate asymmetrically inde- 
pendently of each other, the probability of GSCs inheriting two Cy3 signals will be 
72% (85% × 85%), that of inheriting two Cy5 signals will be 2% (15% × 15%), and 
that of inheriting one Cy3 and one Cy5 signal will be 26% (85% × 15% × 2). 
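
The expected frequencies for the independent case follow from multiplying the per-chromosome probabilities. A minimal Python sketch of that arithmetic is given here, assuming the roughly 85% Cy3-retention figure stated in the text for both chromosomes; the function name `independent_outcomes` is illustrative and is not part of the original methods.

```python
def independent_outcomes(p_cy3: float) -> dict[str, float]:
    """Expected GSC signal combinations if X and Y segregate independently.

    p_cy3 is the probability that a GSC retains the Cy3-labelled strand
    of a given chromosome (about 0.85 for both X and Y in the text).
    """
    p_cy5 = 1.0 - p_cy3
    return {
        "two_cy3": p_cy3 * p_cy3,       # Cy3 retained for both X and Y
        "two_cy5": p_cy5 * p_cy5,       # Cy5 retained for both X and Y
        "one_each": 2 * p_cy3 * p_cy5,  # Cy3 for one chromosome, Cy5 for the other
    }


outcomes = independent_outcomes(0.85)
for name, prob in outcomes.items():
    print(f"{name}: {prob:.3f}")
```

The three values sum to 1 and, to the nearest percentage point, are consistent with the approximate 72%, 2% and 26% quoted above.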


30. Förstemann, K. et al. Normal microRNA maturation and germ-line stem cell 
maintenance requires Loquacious, a double-stranded RNA-binding domain 
protein. PLoS Biol. 3, e236 (2005). 
THE BIG CHALLENGES 
OF BIG DATA 


As they grapple with increasingly large data sets, 
biologists and computer scientists uncork new bottlenecks. 


Extremely powerful computers are needed to help biologists to handle big-data traffic jams. 


BY VIVIEN MARX 


Biologists are joining the big-data club. 
With the advent of high-throughput 
genomics, life scientists are starting to 
grapple with massive data sets, encountering 
challenges with handling, processing and mov- 
ing information that were once the domain of 
astronomers and high-energy physicists’. 
With every passing year, they turn more 
often to big data to probe everything from 
the regulation of genes and the evolution of 
genomes to why coastal algae bloom, what 
microbes dwell where in human body cavities 


and how the genetic make-up of different can- 
cers influences how cancer patients fare’. The 
European Bioinformatics Institute (EBI) in 
Hinxton, UK, part of the European Molecular 
Biology Laboratory and one of the world’s larg- 
est biology-data repositories, currently stores 
20 petabytes (1 petabyte is 10^15 bytes) of data 
and back-ups about genes, proteins and small 
molecules. Genomic data account for 2 peta- 
bytes of that, a number that more than doubles 
every year’ (see ‘Data explosion’). 
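
To give a feel for what annual doubling implies, here is a small Python sketch that projects a data store forward under a fixed doubling time. It assumes steady exponential growth from the 2 petabytes of genomic data mentioned above; because the text says the volume more than doubles each year, a one-year doubling time is a conservative floor rather than a forecast, and the function name is illustrative.

```python
def projected_petabytes(start_pb: float, years: float,
                        doubling_time_years: float = 1.0) -> float:
    """Size after `years` of exponential growth with a fixed doubling time."""
    return start_pb * 2 ** (years / doubling_time_years)


# Project the 2 PB of genomic data over the next five years.
for year in range(6):
    print(year, projected_petabytes(2.0, year))
```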

This data pile is just one-tenth the size of the 
data store at CERN, Europe's particle-physics 
laboratory near Geneva, Switzerland. Every 
year, particle-collision events in CERN’s Large 
Hadron Collider generate around 15 petabytes 
of data — the equivalent of about 4 million 
high-definition feature-length films. But the 
EBI and institutes like it face similar data- 
wrangling challenges to those at CERN, says 
Ewan Birney, associate director of the EBI. He 
and his colleagues now regularly meet with 
organizations such as CERN and the European 
Space Agency (ESA) in Paris to swap lessons 
about data storage, analysis and sharing. 

All labs need to manipulate data to yield 
research answers. As prices drop for high- 
throughput instruments such as automated 
genome sequencers, small biology labs can 
become big-data generators. And even labs 
without such instruments can become big- 
data users by accessing terabytes (10^12 bytes) 
of data from public repositories at the EBI or 
the US National Center for Biotechnology 
Information in Bethesda, Maryland. Each 
day last year, the EBI received about 9 mil- 
lion online requests to query its data, a 60% 
increase over 2011. 

Biology data mining has challenges all of 
its own, says Birney. Biological data are much 
more heterogeneous than those in physics. 
They stem from a wide range of experiments 
that spit out many types of information, such 
as genetic sequences, interactions of proteins 
or findings in medical records. The complexity 
is daunting, says Lawrence Hunter, a compu- 
tational biologist at the University of Colo- 
rado Denver. “Getting the most from the data 
requires interpreting them in light of all the 
relevant prior knowledge,” he says. 

That means scientists have to store large data 
sets, and analyse, compare and share them — 
not simple tasks. Even a single sequenced 
human genome is around 140 gigabytes in size. 
Comparing human genomes takes more than 
a personal computer and online file-sharing 
applications such as DropBox. 

In an ongoing study, Arend Sidow, a com- 
putational biologist at Stanford University in 
California, and his team are looking at specific 
changes in the genome sequences of tumours 
from people with breast cancer. They wanted 
to compare their data with the thousands of 
other published breast-cancer genomes and 
look for similar patterns in the scores of dif- 
ferent cancer types. But that is a tall order: 
downloading the data is time-consuming, 
and researchers must be sure that their com- 
putational infrastructure and software tools 
are up to the task. “If I could, I would routinely 
look at all sequenced cancer genomes,” says 
Sidow. “With the current infrastructure, that’s 
impossible.” 

In 2009, Sidow co-founded a company 
called DNAnexus in Mountain View, Califor- 
nia, to help with large-scale genetic analyses. 
Numerous other commercial and academic 


DATA EXPLOSION 


The amount of genetic sequencing data stored 
at the European Bioinformatics Institute takes 
less than a year to double in size. 
Andreas Sundquist says amounts of data are now 
larger than the tools used to analyse them. 


efforts also address the infrastructure needs of 
big-data biology. With the new types of data 
traffic jam honking for attention, “we now have 
non-trivial engineering problems”, says Birney. 


LIFE OF THE DATA-RICH 

Storing and interpreting big data takes both 
real and virtual bricks and mortar. On the EBI 
campus, for example, construction is under 
way to house the technical command centre 
of ELIXIR, a project to help scientists across 
Europe safeguard and share their data, and to 
support existing resources such as databases 
and computing facilities in individual coun- 
tries. Whereas CERN has one supercollider 
producing data in one location, biological 
research generating high volumes of data is 
distributed across many labs — highlighting 
the need to share resources. 

Much of the construction in big-data biol- 
ogy is virtual, focused on cloud computing 
— in which data and software are situated in 
huge, off-site centres that users can access on 
demand, so that they do not need to buy their 
own hardware and maintain it on site. Labs that 
do have their own hardware can supplement it 
with the cloud and use both as needed. They 
can create virtual spaces for data, software and 
results that anyone can access, or they can lock 
the spaces up behind a firewall so that only a 
select group of collaborators can get to them. 

Working with the CSC — IT Center for Sci- 
ence in Espoo, Finland, a government-run 
high-performance computing centre, the 
EBI is developing Embassy Cloud, a cloud- 
computing component for ELIXIR that offers 
secure data-analysis environments and is cur- 
rently in its pilot phase. External organizations 
can, for example, run data-driven experiments 
in the EBI’s computational environment, close 
to the data they need. They can also download 
data to compare with their own. 

The idea is to broaden access to computing 
power, says Birney. A researcher in the Czech 
Republic, for example, might have an idea 
about how to reprocess cancer data to help the 
hunt for cancer drugs. If he or she lacks the 
computational equipment to develop it, he or 
she might not even try. But access to a high- 
powered cloud allows “ideas to come from any 
place”, says Birney. 

Even at the EBI, many scientists access 
databases and software tools on the Web 
and through clouds. “People rarely work on 
straight hardware anymore,” says Birney. One 
heavily used resource is the Ensembl Genome 
Browser, run jointly by the EBI and the Well- 
come Trust Sanger Institute in Hinxton. Life 
scientists use it to search through, down- 
load and analyse genomes from armadillo to 
zebrafish. The main Ensembl site is based on 
hardware in the United Kingdom, but when 
users in the United States and Japan had dif- 
ficulty accessing the data quickly, the EBI 
resolved the bottleneck by hosting mirror 
sites at three of the many remote data centres 
that are part of Amazon Web Services’ Elastic 
Compute Cloud (EC2). Amazon's data centres 
are geographically closer to the users than the 
EBI base, giving researchers quicker access to 
the information they need. 

More clouds are coming. Together with 
CERN and ESA, the EBI is building a cloud- 
based infrastructure called Helix Nebula 
– The Science Cloud. Also involved are information-technology 
companies such as Atos in Bezons, France; CGI in Montreal, Canada; 
SixSq in Geneva; and T-Systems in Frankfurt, Germany. 

Cloud computing is particularly attractive in an era of reduced 
research funding, says Hunter, because cloud 
users do not need to finance or maintain hard- 
ware. In addition to academic cloud projects, 
scientists can choose from many commercial 
providers, such as Rackspace, headquartered 
in San Antonio, Texas, or VMware in Palo 
Alto, California, as well as larger companies 
including Amazon, headquartered in Seattle, 
Washington, IBM in Armonk, New York, or 
Microsoft in Redmond, Washington. 


BIG-DATA PARKING 

Clouds are a solution, but they also throw 
up fresh challenges. Ironically, their prolif- 
eration can cause a bottleneck if data end 
up parked on several clouds and thus still 
need to be moved to be shared. And using 
clouds means entrusting valuable data to a 
distant service provider who may be subject 
to power outages or other disruptions. “I use 
cloud services for many things, but always 
keep a local copy of scientifically important 
data and software,” says Hunter. Scientists 
experiment with different constellations to 
suit their needs and trust levels. 

Most researchers tend to download remote 
data to local hardware for analysis. But this 
method is “backward”, says Andreas Sundquist, 
chief technology officer of DNAnexus. “The 
data are so much larger than the tools, it makes 
no sense to be doing that.” The alternative is to 
use the cloud for both data storage and com- 
puting. If the data are on a cloud, researchers 
can harness both the computing power and the 
tools that they need online, without the need 
to move data and software (see ‘Head in the 
clouds’). “There’s no reason to move data out- 
side the cloud. You can do analysis right there,” 
says Sundquist. Everything required is avail- 
able “to the clever people with the clever ideas”, 
regardless of their local computing resources, 
says Birney. 

Various academic and commercial ventures 
are engineering ways to bring data and analysis 
tools together — and as they build, they have to 
address the continued data growth. Xing Xu, 
director of cloud computing at BGI (formerly 
the Beijing Genomics Institute) in Shenzhen, 
China, knows that challenge well. BGI is one 
of the largest producers of genomic data in the 
world, with 157 genome sequencing instru- 
ments working around the clock on samples 
from people, plants, animals and microbes. 
Each day, it generates 6 terabytes of genomic 
data. Every instrument can decode one human 
genome per week, an effort that used to take 
months or years and many staff. 


DATA HIGHWAY 

Once a genome sequencer has cranked out its 
snippets of genomic information, or ‘reads’, 
they must be assembled into a continuous 
stretch of DNA using computing and software. 
Xu and his team try to automate as much of 
this process as possible to enable scientists to 
get to analyses quickly. 

Next, either the reads or the analysis, or 
both, have to travel to scientists. Generally, 
researchers share biological data with their 
peers through public repositories, such as the 
EBI or ones run by the US National Center 
for Biotechnology Information in Bethesda, 
Maryland. Given the size of the data, this 
travel often means physically delivering hard 
drives — and risks data getting lost, stolen or 
damaged. Instead, BGI wants to use either its 
own clouds or others of the customer's choos- 
ing for electronic delivery. But that presents a 
problem, because big-data travel often means 
big traffic jams. 

Currently, BGI can transfer about 1 tera- 
byte per day to its customers. “If you transfer 
one genome at a time, it’s OK,” says Xu. “If you 
sequence 50, it’s not so practical for us to trans- 
fer that through the Internet. That takes about 
20 days.” 
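
Xu's estimate can be reproduced with a short Python sketch. The per-genome volume is not stated in the text; the 0.4 terabytes used here is only what the quote implies (20 days at 1 terabyte per day for 50 genomes), and the function name is illustrative.

```python
def transfer_days(n_genomes: int, tb_per_genome: float,
                  tb_per_day: float = 1.0) -> float:
    """Days needed to send n_genomes at a fixed daily transfer capacity."""
    return n_genomes * tb_per_genome / tb_per_day


# Assumption: per-genome volume implied by the quote, not stated directly.
TB_PER_GENOME = 20 / 50

print(transfer_days(1, TB_PER_GENOME))   # a single genome
print(transfer_days(50, TB_PER_GENOME))  # the 50-genome case from the quote
```

Under these assumptions a single genome fits within a day's capacity, whereas a batch of 50 occupies the link for about three weeks.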

BGI is exploring a variety of technologies 
to accelerate electronic data transfer, among 
them fasp, software developed by Aspera in 
Emeryville, California, which helps to deliver 
HEAD IN THE CLOUDS 


In cloud computing, large data sets are 
processed on remote Internet servers, 
rather than on researchers’ local computers. 


data for film-production studios and the oil 
and gas industry as well as the life sciences. 
In an experiment last year, BGI tested a fasp- 
enabled data transfer between China and the 
University of California, San Diego (UCSD). 
It took 30 seconds to move a 24-gigabyte file. 
“That's really fast,” says Xu. 
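
For scale, the test figures can be converted into an average rate and set beside BGI's routine capacity of about 1 terabyte per day. The Python sketch below assumes decimal units (1 terabyte = 1,000 gigabytes) and sustained averages; it compares a single experiment with everyday throughput, so it is illustrative only.

```python
def throughput_gbit_per_s(gigabytes: float, seconds: float) -> float:
    """Average transfer rate in gigabits per second (1 byte = 8 bits)."""
    return gigabytes * 8 / seconds


# The fasp test: a 24-gigabyte file moved in 30 seconds.
fasp_test = throughput_gbit_per_s(24, 30)

# Routine delivery: about 1 terabyte (1,000 gigabytes) per day.
routine = throughput_gbit_per_s(1000, 24 * 60 * 60)

print(f"fasp test: {fasp_test:.2f} Gbit/s")
print(f"routine:   {routine:.3f} Gbit/s")
```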

Data transfer with fasp is hundreds of times 
quicker than methods using the normal Inter- 
net protocol, says software engineer Michelle 
Munson, chief executive and co-founder of 
Aspera. However, all transfer protocols share 
challenges associated with transferring large, 
unstructured data sets. 

The test transfer between BGI and UCSD 
was encouraging because Internet connec- 
tions between China and the United States are 
“riddled with challenges” such as variations 
in signal strength that interrupt data transfer, 
says Munson. The protocol has to handle such 
road bumps and ensure speedy transfer, data 
integrity and privacy. Data transfer often slows 
when the passage is bumpy, but with fasp it does not. Transfers 
can fail when a file is partially sent; with ordinary Internet 
connections, this relaunches the entire transfer. By contrast, 
fasp restarts where 
the previous transfer stopped. Data that are 
already on their way do not get resent, but 
continue on their travels. 
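
The general idea of resuming rather than restarting can be sketched in Python with an ordinary file copy. This is not Aspera's implementation, which is proprietary and not described in the text; it is a generic illustration that assumes the partial destination file is an exact prefix of the source, and the function name `resumable_copy` is hypothetical.

```python
import os


def resumable_copy(src_path: str, dst_path: str, chunk_size: int = 1 << 20) -> int:
    """Copy src to dst, continuing from however many bytes dst already holds.

    Assumes any existing dst content is an exact prefix of src.
    Returns the number of bytes written during this call.
    """
    already = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(already)  # skip the part that arrived before the interruption
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            written += len(chunk)
    return written
```

Calling the function again after an interruption sends only the remaining bytes. A real protocol would also need to verify that the partial copy is intact, for instance with checksums, before appending to it.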

Xu says that he liked the experiment with 
fasp, but the software does not solve the data- 
transfer problem. “The main problem is not 
technical, it is economical,” he says. BGI 
would need to maintain a large Internet con- 
nection bandwidth for data transfer, which 
would be prohibitively expensive, especially 
given that Xu and his team do not send out big 
data in a continuous flow. “If we only transfer 
periodically, it doesn't make any economic 
sense for us to have this infrastructure, espe- 
cially if the user wants that for free,” he says. 

Data-sharing among many collaborators 
also remains a challenge. When BGI uses fasp 
to share data with customers or collaborators, 
it must have a software licence, which allows 
customers to download or upload the data for 
free. But customers who want to share data with 
each other using this transfer protocol will need 
their own software licences. Putting the data on 
the cloud and not moving them would bypass 
this problem; teams would go to the large data 
sets, rather than the other way around. Xu and 
his team are exploring this approach, alongside 
the use of Globus Online, a free Web-based file- 
transfer service from the Computation Institute 
at the University of Chicago and the Argonne 
National Laboratory in Illinois. In April, 
the Computation Institute team launched a 
genome-sequencing-analysis service called 
Globus Genomics on the Amazon cloud. 

Munson says that Aspera has set up a 
pay-as-you-go system on the Amazon cloud 
to address the issue of data-sharing. Later 
this year, the company will begin selling an 
updated version of its software that can be 
embedded on the desktop of any kind of com- 
puter and will let users browse large data sets 
much like a file-sharing application. Files can 
be dragged and dropped from one location to 
another, even if those locations are commercial 
or academic clouds. 

The cost of producing, acquiring and dis- 
seminating data is decreasing, says James 
Taylor, a computational biologist at Emory 
University in Atlanta, Georgia, who thinks 
that “everyone should have access to the 
skills and tools” needed to make sense of all 
the information. Taylor is a co-founder of an 
academic platform called Galaxy, which lets 
scientists analyse their data and share soft- 
ware tools and workflows for free. Through 
Web-based access to computing facilities at 
Pennsylvania State University (PSU) in Uni- 
versity Park, scientists can download Galaxy's 
platform of tools to their local hardware, or 
use it on the Galaxy cloud. They can then plug 
in their own data, perform analyses and save 
the steps in them, or try out workflows set up 
by their colleagues. 

Spearheaded by Taylor and Anton 
Nekrutenko, a molecular biologist at PSU, 
the Galaxy project draws on a community of 
around 100 software developers. One feature 
is Tool Shed, a virtual area with more than 
2,700 software tools that users can upload, 
try out and rate. Xu says that he likes the col- 
lection and its ratings, because without them, 
scientists must always check if a software tool 
actually runs before they can use it. 


KNOWLEDGE IS POWER 

Galaxy is a good fit for scientists with some 
computing know-how, says Alla Lapidus, a 
computational biologist in the algorithmic 
biology lab at St Petersburg Academic Univer- 
sity of the Russian Academy of Sciences, which 
is led by Pavel Pevzner, a computer scientist at 
UCSD. But, she says, the platform might not 
be the best choice for less tech-savvy research- 
ers. When Lapidus wanted to disseminate the 
software tools that she developed, she chose 
to put them on DNAnexus’s newly launched 
second-generation commercial cloud-based 
analysis platform. 

That platform is also designed to cater to 
non-specialist users, says Sundquist. It is pos- 
sible for a computer scientist to build his or her 
own biological data-analysis suite with software 
tools on the Amazon cloud, but DNAnexus 
uses its own engineering to help researchers 
without the necessary computer skills to get to 
the analysis steps. 

Catering for non-specialists is important 
when developing tools, as well as platforms. The 
Biomedical Information Science and Technol- 
ogy Initiative (BISTI) run by the US National 
Institutes of Health (NIH) in Bethesda, Mary- 
land, supports development of new computa- 
tional tools and the maintenance of existing 
ones. “We want a deployable tool,” says Vivien 
Bonazzi, programme director in computational 
biology and bioinformatics at the National 
Human Genome Research Institute, who is 
involved with BISTI. Scientists who are not 
heavy-duty informatics types need to be able to 
set up these tools and use them successfully, she 
says. And it must be possible to scale up tools 
and update them as data volume grows. 

Bonazzi says that although many life 
scientists have significant computational 
skills, others do not understand computer 
lingo enough to know that in the tech world, 
Python is not a snake and Perl is not a gem 
(they are programming languages). But even if 
biologists can’t develop or adapt the software, 
says Bonazzi, they have a place in big-data sci- 
ence. Apart from anything else, they can offer 


valuable feedback to their computationally flu- 
ent colleagues because of different needs and 
approaches to the science, she says. 

Increasingly, big genomic data sets are being 
used in biotechnology companies, drug firms 
and medical centres, which also have specific 
needs. Robert Mulroy, president of Merrimack 
Pharmaceuticals in Cambridge, Massachu- 
setts, says that his teams handle mountains 
of data that hide drug candidates. “Our view 
is that biology functions through systems 
dynamics,” he says. 

Merrimack researchers focus on interro- 
gating molecular signalling networks in the 
healthy body and in tumours, hoping to find 
new ways to corner cancer cells. They generate 
and use large amounts of information from the 
genome and other factors that drive a cell to 
become cancerous, says Mulroy. The company 
stores its data and conducts analysis on its own 
computing infrastructure, rather than a cloud, 
to keep the data private and protected. 

Drug developers have been hesitant about 
cloud computing. But, says Sundquist, that fear 
is subsiding in some quarters: some companies 
that have previously avoided clouds because of 
security problems are now exploring them. To 
assuage these users’ concerns, Sundquist has 
engineered the DNAnexus cloud to be compli- 
ant with US and European regulatory guide- 
lines. Its security features include encryption 
for biomedical information, and logs to allow 
users to address potential queries from audi- 
tors such as regulatory agencies, all of which is 
important in drug development. 


CHALLENGES AND OPPORTUNITIES 

Harnessing powerful computers and numer- 
ous tools for data analysis is crucial in drug dis- 
covery and other areas of big-data biology. But 
that is only part of the problem. Data and tools 
need to be more than close — they must talk to 
one another. Lapidus says that results produced 
by one tool are not always in a format that can
be used by the next tool in a workflow. And if
software tools are not easily installed, computer
specialists will have to intervene on behalf of
those biologists without computer skills.

Various data-transfer protocols handle problems
in different ways, says Michelle Munson.

Arend Sidow wants to move data mountains
without feeling pinched by infrastructure.

© 2013 Macmillan Publishers Limited. All rights reserved

Even computationally savvy researchers can 
get tangled up when wrestling with software 
and big data. “Many of us are getting so busy 
analysing huge data sets that we don’t have 
time to do much else,” says Steven Salzberg,
a computational biologist at Johns Hopkins 
University in Baltimore, Maryland. “We have 
to spend some of our time figuring out ways to 
make the analysis faster, rather than just using 
the tools we have.” 

Yet other big-data pressures come from 
the need to engineer tools for stability and 
longevity. Too many software tools crash too 
often. “Everyone in the field runs into similar 
problems,” says Hunter. In addition, research 
teams may not be able to acquire the resources 
they need, he says, especially in countries 
such as the United States, where an academic 
does not gain as much recognition for soft- 
ware engineering as for publishing a paper. 
With its dedicated focus on data and software 
infrastructure designed to serve scientists, the 
EBI offers an “interesting contrast to the US 
model”, says Hunter.

US funding agencies are not entirely ignor- 
ing software engineering, however. In addi- 
tion to BISTI, the NIH is developing Big Data 
to Knowledge (BD2K), an initiative focused on 
managing large data sets in biomedicine, with 
elements such as data handling and standards, 
informatics training and software sharing. And 
as the cloud emerges as a popular place to do 
research, the agency is also reviewing data- 
use policies. An approved study usually lays 
out specific data uses, which may not include 
placing genomic data on a cloud, says Bonazzi. 
When a person consents to have his or her data 
used in one way, researchers cannot suddenly 
change that use, she says. In a big-data age that 
uses the cloud in addition to local hardware, 
new technologies in encryption and secure
transmission will need to address such privacy
concerns. 

Big data takes large numbers of people. BGI 
employs more than 600 engineers and software 
developers to manage its information-technol- 
ogy infrastructure, handle data and develop 
software tools and workflows. Scores of infor- 
maticians look for biologically relevant mes- 
sages in the data, usually tailored to requests 
from researchers and commercial customers, 
says Xu. And apart from its stream of research 
collaborations, BGI offers a sequencing and 
analysis service to customers. Early last year, 
the institute expanded its offerings with a 
cloud-based genome-analysis platform called 
EasyGenomics. 

In late 2012, it also bought the faltering US 
company Complete Genomics (CG), which 
offered human genome sequencing and 
analysis for customers in academia or drug 
discovery. Although the sale dashed hopes 
for earnings among CG's investors, it doesn't 
seem to have dimmed their view of the pros- 
pects for sequencing and analysis services. “It 
is now just a matter of time before sequencing 
data are used with regularity in clinical practice,”
says one investor, who did not wish to be
identified. But the sale shows how difficult it 
can be to transition ideas into a competitive 
marketplace, the investor says. 

When tackling data mountains, BGI uses 
not only its own data-analysis tools, but also 
some developed in the academic community. 
To ramp up analysis speed and capacity as data 
sets grow, BGI assembled a cloud-based series 
of analysis steps into a workflow called Gaea, 
which uses the Hadoop open-source software 
framework. Hadoop was written by volunteer 
developers from companies and universities, 
and can be deployed on various types of com- 
puting infrastructure. BGI programmers built 
on this framework to instruct software tools to 
perform large-scale 
data analysis across 
many computers at 
the same time. 

If 50 genomes are 
to be analysed and 
the results com- 
pared, hundreds of 
computational steps 
are involved. The 
steps can run either 
sequentially or in parallel; with Gaea, they 
run in parallel across hundreds of cloud-based 
computers, reducing analysis time rather like 
many people working on a single large puzzle 
at once. The data are on the BGI cloud, as are 
the tools. “If you perform analysis in a non- 
parallel way, you will maybe need two weeks 
to fully process those data,” says Xu. Gaea takes
around 15 hours for the same amount of data.
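
The contrast between sequential and parallel execution described above can be illustrated with a minimal Python sketch. It is not BGI's Gaea workflow and it does not use Hadoop: it runs a toy per-genome step (GC content, chosen purely for illustration) on a thread pool on one machine, whereas a Hadoop-based workflow spreads the work across many computers. All function names and the toy sequences are hypothetical. The final function simply turns the figures quoted by Xu (about two weeks versus around 15 hours) into a rough speed-up factor.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence):
    """Toy analysis step: fraction of G and C bases in one sequence."""
    if not sequence:
        return 0.0
    gc_count = sum(1 for base in sequence.upper() if base in "GC")
    return gc_count / len(sequence)


def analyse_sequentially(genomes):
    """Run the step on each genome one after another."""
    return {name: gc_content(seq) for name, seq in genomes.items()}


def analyse_in_parallel(genomes, max_workers=4):
    """Run the same step on several genomes at once using a thread pool."""
    names = list(genomes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(gc_content, [genomes[n] for n in names]))
    return dict(zip(names, results))


def speedup(sequential_hours, parallel_hours):
    """Ratio of sequential to parallel wall-clock time."""
    return sequential_hours / parallel_hours


if __name__ == "__main__":
    # Hypothetical toy data standing in for real genomes.
    toy_genomes = {"sample_a": "ATGCGC", "sample_b": "ATATAT", "sample_c": "GGCC"}
    assert analyse_in_parallel(toy_genomes) == analyse_sequentially(toy_genomes)
    print(analyse_in_parallel(toy_genomes))
    # Two weeks (14 * 24 hours) versus about 15 hours, as quoted in the text.
    print(round(speedup(14 * 24, 15), 1))
```

On the quoted figures, the speed-up works out to roughly 22-fold (336 hours divided by 15 hours). In CPython, threads do not accelerate CPU-bound work, so the sketch shows only the structure of splitting independent steps across workers, not a realistic performance gain.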

To leverage Hadoop’s muscle, Xu and his 
team needed to rewrite software tools. But the 
investment is worth it because the Hadoop 
framework allows analysis to continue as the
data mountains grow, he says.

A simplified array of breast-cancer subtypes, produced by researchers at Merrimack Pharmaceuticals,
who use their own computational infrastructure to hunt for new cancer drugs.

They are still ironing out some issues with 
Gaea, comparing its performance on the cloud 
with its performance on local infrastructure. 
Once testing is complete, BGI plans to mount 
Gaea on a cloud such as Amazon for use by the
wider scientific community. 

Other groups are also trying to speed up 
analysis to cater to scientists who want to use 
big data. For example, Bina Technologies in 
Redwood City, California, a spin-out from Stan- 
ford University and the University of Califor- 
nia, Berkeley, has developed high-performance 
computing components for its genome-analysis 
services. Customers can buy the hardware, 
called the Bina Box, with software, or use Bina’s 
analysis platform on the cloud. 


FROM VIVO TO SILICO 
Data mountains and analysis are altering the 
way science progresses, and breeding biologists 
who get neither their feet nor their hands wet. 
“I am one of a small original group who made
the first leap from the wet world to the in silico
world to do biology,” says Marcie McClure, a
computational biologist at Montana State Uni-
versity in Bozeman. “I never looked back.”
During her graduate training, McClure ana-
lysed a class of viruses known as retroviruses in
fish, doing the work of a “wet-worlder”, as she
calls it. Since then, she and her team have dis- 
covered 11 fish retroviruses without touching 
water in lake or lab, by analysing genomes com- 
putationally and in ways that others had not. She 
has also developed software tools to find such
viruses in the genomes of other species, includ-
ing humans. Her work generates terabytes of
data, which she shares with other researchers. 

Given that big-data analysis in biology is 
incredibly difficult, Hunter says, open science 
is becoming increasingly important. As he 
explains, researchers need to make their data 
available to the scientific community in a use- 
ful form, for others to mine. New science can 
emerge from the analysis of existing data sets: 
McClure generates some of her findings from 
other people’s data. But not everyone recog- 
nizes that kind of biology as an equal. “The 
cultural baggage of biology that privileges data 
generation over all other forms of science is 
holding us back,” says Hunter. 

A number of McClure’s graduate students 
are microbial ecologists, and she teaches them 
how to rethink their findings in the face of so 
many new data. “Before taking my class, none 
of these students would have imagined that 
they could produce new, meaningful knowl- 
edge, and new hypotheses, from existing data, 
not their own,” she says. Big data in biology
add to the possibilities for scientists, she says, 
because data sit “under-analysed in databases 
all over the world”.


Vivien Marx is technology editor at Nature 
and Nature Methods.

Conservation in captivity


Zoos provide an opportunity to work on crucial issues of 
biodiversity while reaching out to the public. 


BY AMANDA MASCARELLI 


Barbara Durrant heard about San Diego
Zoo's reproductive-research department
while she was pursuing her doctorate in
reproductive physiology in the late 1970s. “I
wrote to the founder and got a wonderful let-
ter back saying, ‘Yes, we're starting this new
research effort here. When you finish your
PhD, get back in touch with me,’” recalls Dur-
rant. In 1979, she began a two-year postdoc at
the zoo in California.

Looking for a second project towards the 
end of her stint, Durrant began collecting 
viable eggs, sperm and embryos from ani-
mals that had died, and storing them in the
facility’s Frozen Zoo, one of the world’s first
major collections of cryopreserved cells from 
zoo animals. In 1980, she initiated the Germ- 
plasm Repository — a collection of frozen 
reproductive cells from endangered species 
that capture genetic diversity, allowing it to be 
reintroduced into gene pools. In so doing, she 
helped to launch the field of gamete research. 
After her postdoc ended later that year, the zoo 
offered Durrant a permanent research posi- 
tion. Now director of reproductive physiology 
at San Diego Zoo Global, the conservation 
organization that runs the zoo, Durrant heads 
a team that designs reproductive-research 
programmes for rare and endangered spe- 
cies including giant pandas, rhinoceroses 
and Przewalski’s horses. “The greater scien- 
tific community is coming to understand the 
importance of genetic diversity,” says Durrant.
“And zoos harbour the greatest genetic diver- 
sity anywhere outside of the natural world.” 
In the past few decades, zoos and aquariums 
around the globe have transformed themselves. 
No longer just family destinations and collec- 
tions of rare, threatened and endangered ani- 
mals, they are also research institutions with 
conservation and science at the core of their 
mission. Zoos are well positioned to manage 
populations of animals whose numbers are rap- 
idly dwindling in their natural habitat, and, in 
some cases, to reintroduce them into the wild. 
And although they have tended to empha- 
size captive-breeding programmes, zoos are 
becoming increasingly focused on field-based 
research and on saving species in the wild. 


CALL OF THE WILD 

Research positions involving conservation at 
zoos and aquariums are still relatively sparse. 
But many scientists find such jobs deeply sat- 
isfying. The research is mission-driven and 
aimed at solving immediate problems, so zoo- 
logical facilities tend to attract scientists who 
embrace an applied approach, says Allison 
Alberts, chief conservation and research 
officer at San Diego Zoo Global. 

“I always thought I was going to end up in
the traditional academic environment,” says 
Alberts. “I value academic research very much. 
But I wanted to do something more immedi- 
ate. I saw a crisis in the world that needed to 
be addressed now. I felt like, ‘I don’t have the 
luxury to wait and see if my research is going to 
be relevant 30 years from now — I want to be 
doing something that’s solving the conserva-
tion problem today.’ And the zoo gave me the
opportunity to do that.”

Like Durrant, Alberts joined the San
Diego Zoo as a postdoc, and ended up forg- 
ing her career there. Whereas some positions 
at zoos and aquariums involve only research, 
others may require engaging with the public 
and overseeing staff and projects. In addition 
to coordinating all of San Diego Zoo Global's 
research initiatives in areas such as sustainable 
populations, restoration biology and habitat 
conservation, Alberts helps to raise the funding 
that supports the zoo’s conservation efforts. She
misses hands-on research, but says that being
part of the zoo’s conservation initiatives pro-
vides a “whole different type of satisfaction”.

With one of the largest zoological con- 
servation programmes in the world, the San 
Diego Zoo employs more than 200 research- 
ers, including 12 field-conservation postdocs. 
It has research projects in 38 countries and an 
annual conservation budget of US$15 million, 
of which $6 million comes from grants and 
government contracts, and the rest from dona- 
tions and zoo operations. 

Zoos that employ such large numbers of 
researchers are rare. However, many have 
robust conservation-science programmes; they 
include the Calgary Zoo in Canada, the Smith- 
sonian Institution's National Zoo in Washing- 
ton DC, Antwerp Zoo in Belgium and London 
Zoo. In addition to postdoc positions, research- 
ers may find work as technicians, field and lab 
managers, educators or scientists leading their 
own research programmes at the zoo or in the 
field. Many, including Durrant and Alberts, are 
adjunct or full professors at nearby universities, 
enabling them to mentor students directly and 
to forge collaborations with academic research- 
ers. And scientists with PhDs are sometimes 
employed as curators in a specific area such as 
reptiles or birds. 


ALL CREATURES GREAT AND SMALL 
Although in the past zoos have not tended to 
be seen as research centres, that is changing. 
“Within more traditional academia, I think 
it’s quite easy to dismiss zoos and aquariums 
as a place where you could do real science,” 
says Jackie Ogden, vice-president of animals, 
science and environment at Walt Disney Parks 
and Resorts, who is based at Disney’s Animal 
Kingdom in Orlando, Florida. Ogden says that 
Disney researchers have been involved in more 
than 300 scientific articles in the past 15 years. 
Her team includes 14 PhD students, most of 
them active in conservation research, she says. 
In one project, researchers monitor sea-turtle 
nesting on the central Florida coast in collabo- 
ration with local universities and state wildlife 
agencies. Disney researchers have contributed 
to rehabilitation of more than 350 sea turtles 
over the past 20 years, says Ogden. 
Aquariums have also grown into strong 
conservation-research centres. The Tennessee 
Aquarium Conservation Institute, the 
research arm of the Tennessee Aquarium in 
Chattanooga, is involved in restoration and
reintroduction of two imperilled fish spe-
cies — lake sturgeon and southern Appalachian
brook trout — to the Tennessee River system. 
Anna George, the institute’s director and chief 
research scientist, says that the job gives her the 
opportunity to put conservation principles into 
practice. With a PhD in conservation genetics of 
freshwater species, she has a deep understand- 
ing of field-based genetic diversity. Her work 
lets her apply that knowledge while collaborat- 
ing with others who have expertise in raising 
fish in captivity. “We can make sure that we're 
really recovering a species with the ability to 
adapt, not just putting individuals into a river,”
she says. 


WALK WITH THE ANIMALS 
As zoos and aquariums become more conser- 
vation-oriented, their research increasingly 
focuses on animals in their natural habitats. 
As a result, opportunities are growing for 
researchers to work with plants and animals 
in the field, says Ron Swaisgood, director of 
applied animal ecology at San Diego Zoo Insti- 
tute for Conservation Research. “Zoos are in 
the process of reinventing themselves,” he says.
“People don't think of plant ecology as being a 
zoo research programme — but it is.” 
Swaisgood and others think that such
jobs, including field research, will continue
to grow as zoos become focused on
conservation, pooling resources from
donations and external grants from local,
state and federal regulatory agencies.
In 2011, facilities accredited by the US
Association of Zoos and Aquariums in Silver
Spring, Maryland, spent US$160 million
on 2,670 research and conservation projects
in more than 100 countries, up from
US$134 million in 2010, according to
the association's most recent annual report
on conservation science.
In 2009, the European Association of Zoos 
and Aquaria (EAZA), based in Amster- 
dam, estimated that it provides €30 million 
(US$39 million) per year in paid staff time and 
costs for zoological research. It also reported 
that 1,400-1,500 people conduct or facili- 
tate research as part of their jobs in zoos and 
aquariums in Europe. This July, the EAZA will 
launch its own online, open-access publication, 
The Journal of Zoo and Aquarium Research, to 
provide more outlets for zoo-oriented science. 
Research in zoos can be quite different from 
field research, says Lesley Dickie, executive 
director of the EAZA. For instance, she says,
zoo research might focus on animal behav-
iours that are not seen in the wild because
they are very hard to observe. But if the
research concentrates on a highly threatened
species, sample sizes in both the wild and cap-
tivity might be very small, making zoo work
that much more relevant to ‘real-world’ cir-
cumstances, and more valuable. “As the wild
gets more and more pressurized, I think that
some of the things we're learning about small-
population management in zoos will be more
and more applied to the wild,” says Dickie.

A researcher from San Diego Zoo in California tracks a koala on St Bees Island, Australia.


HUMANS AND OTHER ANIMALS 

Myriad skill sets can open doors to work in 
zoos and aquariums. Basic research in areas 
such as animal behaviour or reproductive 
biology continues to be important, says Dur- 
rant, and training in genetics, wildlife disease 
and conservation education is also valued. It 
is not necessary to have worked with exotic 
animals or in zoos previously, she notes: 
basic-research training with model species 
in universities is sufficient. “Get the strongest 
solid foundation you can get and that you can 
apply to conservation.” 

However, some experience at a zoo or 
aquarium, even as a volunteer, can make the 
transition easier. While doing her PhD at 
Saint Louis University in Missouri, George 
began working in the education department 
at Saint Louis Zoo, leading overnight and 
summer-camp education programmes. That 
experience was key to her being hired at the 
Tennessee Aquarium. “They knew I already 
understood the culture and goals of zoos and 
aquariums and the informal science-educa-
tion part of that,” she says. “So even if it’s vol-
unteering or serving as a keeper, that first step
into it makes it a lot easier to get a job later.”

Scientists interested in zoo work would 
do well to supplement their training with 
other skills related to conservation. Classes 
in non-profit management and fund-raising
can help. And George advises that researchers
get comfortable with outreach, including the 
art of educating donors about their research. 

“We need people who are limber enough 
to move between field and zoo,” says John
Fraser, a conservation psychologist who is 
president of the New Knowledge Organi- 
zation, a social-science think tank based in 
New York. “It’s the ability to have a foot in 
both worlds, with the authority of the field 
biologist and the access of the zoo biologist.” 
He suggests pairing a field-biology degree 
with a minor in community organizing, 
organizational psychology or advocacy. 

Regardless of the academic path, the abil- 
ity to work with people — not just animals 
— is crucial. “The outreach I do ranges from 
elementary-school students to politicians 
to journalists and everything in between,” 
says George. “Each programme is different. 
You have to be comfortable being flexible.” 
Rachel Lowry, director of wildlife con- 
servation and science at Zoos Victoria in 
Melbourne, Australia, finds that her most 
profound experiences come from engag- 
ing with audiences and helping to influence 
people's behaviour. “Zoos are really power- 
ful conservation organizations because they 
have an enormous reach, and because they 
are entrusted with these incredible animals 
within their care,” says Lowry. “To have an
orang-utan stand behind you while you give 
a talk, and you say, ‘Who here pledges to pur- 
chase only certified sustainable palm oil?’ 
and an orang-utan raises its hand — it’s very 
moving. Everyone standing in front of that 
orang-utan who has come to connect with it 
emotionally suddenly raises their hand and 
says, ‘Yeah, I don’t want that species to go
extinct because of the food that I choose.’ It’s
a really powerful role.”


Amanda Mascarelli is a freelance writer 
based in Denver, Colorado.

EMPLOYMENT


On the job 


US graduate-degree holders aged 30-54 
with a background in life or physical 
sciences had an unemployment rate of 
2.1% and a median salary of US$90,000 in 
2010-11, according to an analysis of census 
data. Hard Times 2013: College Majors, 
Unemployment and Earnings, released
on 29 May by Georgetown University
Center on Education and the Workforce
in Washington DC, found that life- or
physical-science graduates in the same age 
range with only a bachelor’s degree had 
4.8% unemployment and a median salary 
of $60,000. With research jobs scarce, many 
science-graduate-degree holders work in 
secondary education, or in non-research 
posts in industries such as pharmaceuticals 
or aerospace, notes co-author Anthony 
Carnevale, the centre’s director. 


MEDICINE 


Oncology burnout 


Although 83% of US oncologists report 
career satisfaction, about 45% experience 
emotional exhaustion or other symptoms 
of burnout, says a study presented on
2 June at the meeting of the American
Society of Clinical Oncology in Chicago, 
Illinois. The 2012-13 survey of about 1,500 
oncologists found a link between burnout 
and high patient volume. Academic 
oncologists spend more time with patients 
and less on research than in the past, says 
lead author Tait Shanafelt, a haematologist 
and oncologist at the Mayo Clinic in 
Rochester, Minnesota. He suggests that 
early-career academic oncologists need to 
preserve their research time. 


AWARDS 


Prizes for the young 


US researchers under the age of 42 will
be able to vie for one of three annual
unrestricted awards of US$250,000 in
life sciences, chemistry, and physical
sciences and engineering, the Blavatnik
Family Foundation in New York and
the New York Academy of Sciences
(NYAS) announced on 3 June. “We want 
to highlight young researchers who are 
doing such extraordinary and innovative 
work that it will incentivize other young 
researchers,” says NYAS president Ellis
Rubinstein. Nominations from US research 
universities and institutions, national
labs and academic medical centres will be
accepted from October to December 2013. 
NYAS council members may nominate 
industry researchers.

SCIENCE FICTION


MORTAR FLOWERS 


BY JESSICA MAY LIN 


Sometimes in the morning, a single gull
would cry, after the mortar shells had
rained all night and spilt blood trick-
led down the alley walls into the sunbaked
asphalt.

The Cement Florist boiled jars of coloured 
resin in the crumbling kitchen of his third- 
floor apartment, which overlooked 
the warships in the harbour. He 
bit a cigar between his teeth as he 
spooned hot resin out of its jar and 
let it fall back, occasionally glancing 
over his shoulder at the neatly made 
bed with its blue-and-pink-striped 
quilt. 

He had awoken in the middle of 
the night to gunshots in the cul-de- 
sac. Another execution. It was at 
times like these, when he lay alone 
in the dark and the screams ate into 
his mind, that he missed her most. 

Drawing his brown leather jacket 
around his shoulders, he set one of 
his jars under his arm and locked 
the door. 

The alley was filled with cold, 
pale faces. Eyes open, staring life- 
lessly past him at the empty mus- 
tard gas canisters rolling in the 
shadows. The concrete had been blasted 
away by mortar bombs, leaving spiralling, 
blotched scars that decorated the pavement 
like bruises. 

The Cement Florist opened a jar of hot 
yellow resin, honeyed vapours rising out of 
the glass. He slowly poured the contents into 
the whorled contours of a mortar scar. 

Achillea millefolium. The bloodwort 
flower, once used on the battlefield to 
staunch a soldier's bleeding wounds. 


The fires in the Juku Ghetto had finally died, 
taking the rotting tenements with them. 
The Cement Florist stood under the over-
hang of a destroyed brothel, carrying his jars.
The prostitutes glared at him with accusing 
eyes from where they huddled in the ruins, 
neglected, lace garters ripped and nails long. 
Street urchins ducked in the rain, hugging 
to their chests the spokes of a broken chan-
delier they’d hauled out of the river after last
night's flood.


The art of remembering.

This was where he had first met her,
when he sold flowers out of his rusted truck
to the working men, for their sweethearts.
Things had been different then. Lovers 
walked with their heads up, and children 
didn’t fight each other with sticks. There 
hadn't been the pasty smell of ashes, which 
drifted down on the city like snowflakes. 




He stepped into the charred street, his face 
streaked with rain and tears, and fell to his 
knees. He filled the concrete scars with blue 
resin, for the urchins’ dirty scarves, wrapped 
around defunct mortar shells that they'd 
painted into dolls. 

Myosotis scorpioides. The true forget-me- 
not, for children from whom the war had 
cruelly robbed their innocence, shivering in 
the cold and forgotten. 


He used to walk with her in the Hanging 
Gardens, which now hung limp, brown and 
wilted from the mustard gas. Waffle crumbs 
still littered the marble walkways where 
young lovers once walked through the dap- 
pled sunlight, licking ice cream cones. 

He’d read a story in the newspaper last
week about a boy and a girl who tried to 
escape on the long bridge that led from the 
gardens out of the city. The snipers found 
them before they could taste freedom. Their 
bodies still lay entwined in the dust, where 
nobody had bothered to retrieve them.


Standing in the dry shadows of limp, dead 
ivy, the Cement Florist wondered about 
what could’ve been — if he had taken her 
hand and run. If they would be together 
right now, in this life or the next. 

He sighed. 

Apple pies and warm Saturday mornings 
sipping coffee in bed, watching sailing-boats 
race in the harbour. All these things had 
been stolen from his fingertips. 

Eventually he set the jar of white resin 
down on the ground, and filled the 
blotched concrete. 

Asphodelus aestivus. The summer
asphodel, the flower of the underworld. 
They say that in the Silent Meadow — 

the place where all lovers are eventu- 
ally united — it grows in soft fields, 
slowly bending in a nonexistent 
breeze. That’s where he would meet 
her. 


Back in his apartment, the Cement 
Florist sat down on the edge of the 
bed. He lifted the quilt with his 
blistered hands and breathed in her 
warm, lavender scent. 

The hands of the brass clock hang- 
ing over the sink moved onto the hour. 

He looked at the door. 
They came when he had known they 
would. 

The gloved men with cold faces, who car- 
ried rifles and ordered him to come outside 
into the street. 

He followed them in silence, and 
thought of white flowers in a sweet-scented 
field, when they drew a knife across his 
neck and laid him down on the pavement
to bleed. 

His heartbeat was the last thing he heard, 
the sun warm against his skin, when he 
exhaled for the last time into the musty 
evening air. 

His blood swirled into the concrete scar 
where a mortar shell had fallen that morn- 
ing — a bright, flowing red. 

Protea cynaroides. The king protea, the 
oldest flower in the world. One of great 
strength and courage — for a man who 
devoted himself to changing suffering into 
art, to making beauty where it no longer 
exists, even if no one will ever see it.


Jessica May Lin is a student at the 
University of California, Berkeley. Her 
work is also forthcoming in Daily Science 
Fiction. Her website is jessicamaylin.com.

BRIEF COMMUNICATIONS ARISING


Properties of native brain α-synuclein


ARISING FROM T. Bartels, J. G. Choi & D. J. Selkoe Nature 477, 107-110 (2011) 


α-Synuclein is an abundant presynaptic protein that binds to nega-
tively charged phospholipids, functions as a SNARE-complex
chaperone and contributes to Parkinson’s disease pathogenesis.
Recombinant α-synuclein in solution is largely unfolded and devoid
of tertiary structure, but Bartels et al. have proposed that native
α-synuclein purified from human erythrocytes forms a stably folded,
soluble tetramer that resists aggregation. By contrast, we show here
that native α-synuclein purified from mouse brain consists of a largely
unstructured monomer, exhibits no stable tetramer formation, and is
prone to aggregation. The native state of α-synuclein is important for
understanding its pathological effects as a stably folded protein would be
much less prone to aggregation than a conformationally labile protein.
There is a Reply to this Brief Communication Arising by Bartels, T. & 
Selkoe, D. J. Nature 498, http://dx.doi.org/10.1038/nature12126 (2013). 
Figure 1 | Recombinant α-synuclein and brain α-synuclein in cytosol are
monomeric. a, b, Immunoblotting analysis of mouse brain homogenate
(input), cytosol and membranes (a), and quantification of protein levels
(b; means ± s.e.m.; n = 3). c, Native mouse brain α-synuclein (375 μg) elutes as
an apparent tetramer during gel filtration on a Superdex 200 column (top), as
analysed by α-synuclein immunoblotting (bottom). mAU, milli absorbance
unit. d, Analysis of purified recombinant myc-epitope-tagged α-synuclein (rec.
myc-α-syn) by SDS-PAGE and immunoblotting. e, Molecular mass calibration
curve for gel filtration (Rp = migration distance of proteins versus total running
We examined native α-synuclein from brain, the most relevant
organ for understanding neurodegeneration. Separation of mouse
brain homogenates into soluble and membrane fractions revealed that
during ultracentrifugation, most α-synuclein partitioned into cytosol
fractions similar to complexins, but different from membrane proteins
such as cysteine string protein (CSP)-α and SNAP25 (Fig. 1a, b). Using
gel filtration, we analysed the size of native α-synuclein in brain cytosol
and of recombinant myc-epitope-tagged human α-synuclein, purified
without boiling or detergents. Both α-synucleins eluted in a single
peak with an apparent molecular mass of ~63 kDa (Fig. 1c-f), close
to that predicted for a folded tetramer.

These results seem to confirm that α-synuclein forms a stable tetramer
in solution. However, dynamic or unstructured states of a protein
may increase its hydrodynamic radius and apparent molecular mass
distance; y axis = logarithm of molecular protein mass (Mr)). f, Recombinant
myc-tagged human α-synuclein (16 μg) also elutes as an apparent tetramer
during gel filtration. g, Circular dichroism spectroscopy shows that recombinant
α-synuclein (10 μg) is unstructured in solution and becomes α-helical upon
liposome binding. PC, phosphatidylcholine; PS, phosphatidylserine. Molar
protein-to-lipid ratio, 1:530; θ = molar ellipticity. h, Recombinant (0.5 μg) and
α-synuclein in brain cytosol (12 μg total protein) run as apparent tetramers on
blue native gels without boiling or after boiling for 5 min. i, SEC-MALS reveals
that recombinant α-synuclein (0.5 mg) is monomeric.


©2013 Macmillan Publishers Limited. All rights reserved 


during gel filtration. Indeed, circular dichroism spectroscopy showed
that recombinant α-synuclein lacked detectable secondary structure,
but became α-helical upon membrane binding (Fig. 1g). Consistent
with the gel-filtration analysis, both native and recombinant α-synuclein
migrated as a single band of ~65 kDa on blue native gels. Notably,
however, both recombinant and native α-synuclein still migrated at
that apparent size after boiling, which disrupts secondary and tertiary
structures, arguing against a folded multimer (Fig. 1h). Furthermore,
size-exclusion chromatography coupled with multi-angle laser-light
scattering (SEC-MALS) revealed that recombinant α-synuclein was
monomeric (Fig. 1i). As native α-synuclein in brain cytosol and recombinant
α-synuclein behave identically in gel filtration and blue native
gel-electrophoresis experiments, the SEC-MALS demonstration that
recombinant α-synuclein is monomeric suggests that native brain
α-synuclein in cytosol is also monomeric.
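
The size argument above can be made concrete with a little arithmetic. The following Python sketch is an illustration only and not part of the original analysis: it divides the apparent masses reported in the text (~63 kDa by gel filtration, ~65 kDa on blue native gels) by the predicted mass of the mouse brain protein quoted in this communication (14,485 Da). The function name apparent_subunits is our own, and the mass of the myc-tagged recombinant protein, which is not stated, is not modelled.

```python
# Illustrative arithmetic only; all masses are taken from the text.
MONOMER_DA = 14_485  # predicted mass of mouse brain alpha-synuclein (Da)


def apparent_subunits(apparent_kda: float, monomer_da: float = MONOMER_DA) -> float:
    """Monomer equivalents implied by an apparent mass, assuming a compact protein."""
    return apparent_kda * 1000.0 / monomer_da


gel_filtration = apparent_subunits(63.0)  # ~63 kDa peak (Fig. 1c-f)
blue_native = apparent_subunits(65.0)     # ~65 kDa band (Fig. 1h)

print(f"gel filtration:  {gel_filtration:.2f} monomer equivalents")
print(f"blue native gel: {blue_native:.2f} monomer equivalents")
```

Both ratios fall a little above four, which is why either method alone reads as an apparent tetramer. SEC-MALS determines mass from light scattering rather than from elution position, so it is the measurement that distinguishes an extended monomer from a compact tetramer.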

We next tested whether native brain α-synuclein is still monomeric
even when purified. We purified α-synuclein from mouse brain without
detergents or denaturing conditions (purity >90%; Fig. 2a). Mass
spectrometry showed that native brain α-synuclein was substantially
larger than predicted (measured mass, 16,408 ± 894 Da (n = 3); predicted
mass, 14,485 Da). The increased mass is partly due to amino-terminal
acetylation of brain α-synuclein (Fig. 2b). SEC-MALS
revealed that freshly purified native α-synuclein was again predominantly
monomeric (Fig. 2c). We also observed a plateau along the left
shoulder of the main SEC-MALS peak with a mass of ~58 kDa that
contained little detectable α-synuclein (<5% by immunoblotting),
and whose observed molecular mass is inconsistent with a putative
tetramer. Circular dichroism spectroscopy showed a largely random-coil
conformation (34-59%) with α-helical contributions (21-24%;
Fig. 2d). Purified α-synuclein aggregated in a time-dependent manner,
with a relative increase in overall secondary structure as observed by
circular dichroism spectroscopy (Fig. 2d), and the appearance of larger
particles as uncovered by dynamic light scattering (Fig. 2e).
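
To put the mass-spectrometry numbers in this paragraph in proportion, the short Python sketch below computes the excess of the measured over the predicted mass and the share of that excess that a single acetyl group could explain. It is an illustration under stated assumptions, not a reanalysis: the value of about 42 Da for one acetylation is a standard figure that we supply, and the variable names are our own.

```python
# Illustrative arithmetic only; measured and predicted masses are from the text.
MEASURED_DA = 16_408     # measured mass of native brain alpha-synuclein (Da)
MEASURED_SPREAD_DA = 894  # reported spread of the measurement (n = 3)
PREDICTED_DA = 14_485    # mass predicted from the sequence (Da)
ACETYL_DA = 42.04        # assumption: average mass added by one acetyl group (Da)

excess_da = MEASURED_DA - PREDICTED_DA
acetyl_share = ACETYL_DA / excess_da
excess_in_spreads = excess_da / MEASURED_SPREAD_DA

print(f"excess mass: {excess_da} Da")
print(f"share explained by one acetyl group: {acetyl_share:.1%}")
print(f"excess relative to reported spread: {excess_in_spreads:.2f}x")
```

Under these assumptions a single N-terminal acetylation covers only a small part of the excess, in line with the statement that the increased mass is only partly due to acetylation.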

Our data show that native brain α-synuclein primarily consists of
an unstructured monomer, but readily aggregates in a time-dependent
manner. This conclusion was demonstrated both for unpurified
α-synuclein as a component of brain cytosol (Fig. 1), and for purified
α-synuclein in solution (Fig. 2c). Purified brain α-synuclein, analysed
here for the first time, carries significant post-translational
modifications (Fig. 2b), which do not, however, seem to alter its folding,
as the biophysical properties of recombinant unmodified α-synuclein
and native modified α-synuclein were similar (Figs 1 and 2). The differences
between our results with brain α-synuclein and those obtained
with erythrocyte α-synuclein may be due to erythrocyte-specific
post-translational modifications, or to time-dependent multimerization/aggregation
of erythrocyte α-synuclein that may have been
overlooked. Indeed, the circular dichroism spectrum of erythrocyte
α-synuclein is similar to that of purified brain α-synuclein after 75 h
incubation (Fig. 2d). Whichever explanation accounts
for the differences in results obtained with brain and erythrocyte
Figure 2 | Purified native brain α-synuclein is predominantly an
unstructured monomer that aggregates in a time-dependent manner.
a, SDS-PAGE analysis of five stages of α-synuclein purification from mouse
brain. IEX, anion exchange chromatography; HIC, hydrophobic interaction
chromatography. Purified α-synuclein was also analysed by immunoblotting
and mass spectrometry as shown. b, Mass spectrometry analysis reveals
N-terminal acetylation of native α-synuclein. Shown is an extracted ion
chromatogram of the N-terminally acetylated α-synuclein peptide. Inset,
tandem MS spectrum containing the sequence of the N-terminal peptide and
identified b and y ions. c, SEC-MALS shows that purified brain α-synuclein
(150 μg) is largely monomeric (main peak with a mass of 17 ± 1 kDa), but
includes a minor component (plateau along the left shoulder with a mass of
58 ± 5 kDa) that contains little detectable α-synuclein (see immunoblot in
boxed region). Calculated masses were extracted from marked areas. d, Circular
dichroism spectroscopy of freshly purified brain α-synuclein (0.12 g per l =
7.5 μM) shows mainly disordered conformations that progressively acquire
structured conformations as a result of time- and temperature-dependent
aggregation. RT, room temperature. e, Purified brain α-synuclein
(0.12 mg ml⁻¹) rapidly aggregates as measured by dynamic light scattering
immediately (0 h) or 152 h after purification.
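
The concentration quoted in panel d of this caption (0.12 g per litre, equal to 7.5 μM) can be checked against the masses given in the text. The Python sketch below is an illustrative unit conversion only; the helper name micromolar is our own and is not taken from the paper.

```python
# Illustrative unit conversion; concentrations and masses are from the text.
def micromolar(grams_per_litre: float, molar_mass_da: float) -> float:
    """Convert a mass concentration (g per litre) to micromolar."""
    return grams_per_litre / molar_mass_da * 1e6


# Molar mass implied by the caption's own equivalence, 0.12 g per litre = 7.5 uM.
implied_mass_da = 0.12 / 7.5e-6

with_measured_mass = micromolar(0.12, 16_408)   # measured mass (Da)
with_predicted_mass = micromolar(0.12, 14_485)  # predicted mass (Da)

print(f"implied molar mass: {implied_mass_da:.0f} Da")
print(f"using measured mass:  {with_measured_mass:.2f} uM")
print(f"using predicted mass: {with_predicted_mass:.2f} uM")
```

The equivalence in the caption corresponds to a monomer of about 16 kDa, which is closer to the measured mass of the native protein than to the mass predicted from the sequence.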
α-synuclein, the conformationally labile state of native brain α-synuclein
documented here provides a potential explanation for why α-synuclein
is susceptible to pathological aggregation as observed in multiple
neurodegenerative disorders.


Methods 


α-Synuclein was purified from mouse brain cytosol (obtained from brain homogenates
by ultracentrifugation at 280,000g) by sequential chromatography on Q
sepharose (elution at 0.3-0.5 M NaCl, 20 mM Tris-HCl, pH 7.4), phenyl sepharose
(flow-through in 1 M (NH4)2SO4) and Superdex-200 10/300GL. SEC-MALS
was performed on a WTC-030S5 column (Heleos OptiLab instruments, Wyatt
Technology). Circular dichroism spectra were measured in 25% PBS on an Aviv
CD Spectrometer and deconvolved (http://dichroweb.cryst.bbk.ac.uk/html/
home.shtml) with Contin-4 and -7 reference sets. Mass spectrometry was performed
on purified α-synuclein or α-synuclein-containing gel pieces digested
with Glu-C and Protease Max (Promega, using standard procedures). All other
methods have been described previously.
REPLYING TO J. Burré et al. Nature 498, http://dx.doi.org/10.1038/nature12125 (2013)


In disagreeing with our report that native α-synuclein occurs physiologically
as an α-helically folded tetramer in neural and erythroid cells,
Burré et al. conclude instead that 'native brain α-synuclein' consists of a
largely unstructured monomer. They imply two things about our
paper that are inaccurate: (1) that our findings pertained only to erythrocyte
α-synuclein (we reported multiple experiments on neural cells);
and (2) that we concluded that cellular α-synuclein is a stable tetramer
under all conditions (we did not use the term 'stable', and we observed
monomers and some other oligomers in normal cells (e.g., Fig. 1d of
ref. 1)). Indeed, we emphasized the need to discover "compounds that
... could kinetically stabilize native tetramers and prevent pathogenic
α-synuclein aggregation". Although the data in our report suggest that
tetramers are the predominant native species, tetramers and other oligomers
arise from monomers, so there must be an equilibrium between
monomeric and oligomeric forms in cells. Pathogenic events (e.g.,
mutations) could alter this equilibrium, and some therapeutic compounds
could potentially re-establish it, as we explicitly suggested.

Most findings in Fig. 1 of Burré et al. confirm previous reports
(including ours) that recombinant α-synuclein is an unfolded monomer
of ~14 kDa but migrates anomalously at ~60 kDa in gel filtration
(their Fig. 1f), presumably owing to the large hydrodynamic radius of
an extended monomer. We had stated that this made "gel filtration an
unreliable indicator [of mass] and therefore [it was] not used here".
That recombinant α-synuclein becomes α-helical upon binding phospholipid
vesicles (their Fig. 1g) was also long known and observed by
us. The key difference from our work concerns their data on the folding
and assembly state of native α-synuclein (their Fig. 2). We believe
these data are less in disagreement with our conclusions than the
authors suggest. First, they show by size-exclusion chromatography
coupled with multi-angle laser-light scattering (SEC-MALS) the
existence of small amounts of α-synuclein tetramer (58.5 kDa) in their
natively purified brain preparation (their Fig. 2c). Second, their Fig. 2d
shows circular dichroism spectra of purified brain α-synuclein that display
a mixture of unfolded (34-59%) and α-helically folded (21-24%)
protein, a clear structural difference from recombinant α-synuclein,
which is all unfolded (their Fig. 1g, 'buffer'). Their findings are not
entirely incompatible with our paper, as we had stated that helical tetramers
were the predominant physiological species but variable amounts
of monomers and other oligomers were observed.

Given that even the helically folded tetramer suggested by us (and
others) contains only about 50% helical structure (as the regions around
amino acid 50 form structured loops and the carboxy terminus is conformationally
mobile), the fact that their circular dichroism spectrum
contains ~24% helical conformation suggests that up to half of their
brain α-synuclein sample is folded and the other half is unfolded. The
latter result raises the possibility of either differences in tetramer:monomer
equilibria between their (murine brain) and our (human erythrocyte
or neuroblastoma) samples or a partial denaturation of the brain sample
during purification. Interestingly, room-temperature incubation of their
unfolded monomeric/partly folded tetrameric sample led to overall loss
of circular dichroism spectral intensity (by ~50%), probably due to
aggregation and precipitation of some of the protein out of solution,
and a relative increase in helical content of the protein remaining in
solution (their Fig. 2d, green). The authors correctly indicate that
their spectrum of purified brain α-synuclein is now similar to our spectrum of
purified erythrocyte α-synuclein. They say this conversion indicates that
"purified α-synuclein aggregated in a time-dependent manner, with a
relative increase in secondary structure", but the 'aggregation' they describe
for this helical change differs from the widely studied pathogenic
aggregation of α-synuclein, which involves a conversion to a β-sheet-rich
structure. It was the latter type of aggregation that we showed native
α-synuclein to be resistant to (Fig. 3d of ref. 1). Burré et al. observed
loss of α-helical structure only after heating brain α-synuclein to 95 °C (their
Fig. 2d, orange), a condition that similarly led to denaturation of our
purified α-synuclein helical tetramers (Supplementary Fig. 11 of ref. 1)
and thus does not disprove our conclusion that native helical α-synuclein
does not readily aggregate under physiological conditions.
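
The estimate at the start of this paragraph, that up to half of the brain sample is folded, follows from a simple two-state mixture calculation. The Python sketch below spells it out as an illustration only. It assumes, as a simplification that is ours and not the authors', that the observed helical content is a linear mixture of a folded species with about 50% helix and an unfolded species with no helical signal; the function name folded_fraction is hypothetical.

```python
# Two-state mixture estimate; helical percentages are taken from the text.
def folded_fraction(observed_helix_pct: float,
                    folded_helix_pct: float = 50.0,
                    unfolded_helix_pct: float = 0.0) -> float:
    """Fraction of folded protein implied by an observed helical content.

    Assumes the observed signal is a linear mixture of two species.
    """
    span = folded_helix_pct - unfolded_helix_pct
    if span <= 0:
        raise ValueError("folded species must be more helical than unfolded species")
    return (observed_helix_pct - unfolded_helix_pct) / span


low_estimate = folded_fraction(21.0)   # lower bound of the reported 21-24% range
high_estimate = folded_fraction(24.0)  # upper bound of the reported range

print(f"folded fraction: {low_estimate:.0%} to {high_estimate:.0%}")
```

With these inputs the folded fraction comes out somewhat below one half, consistent with the phrase "up to half". The estimate would shift if the unfolded species contributed any helical signal of its own.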

The loss of overall circular dichroism signal accompanied by an
increase in α-helical spectral components that Burré et al. show in
Fig. 2d could be interpreted in two ways: (1) some sample precipitation
occurs, and at the same time the remaining soluble α-synuclein
becomes increasingly α-helically folded (such an event could be interpreted
as the refolding of a partially denatured protein); or (2) the
monomeric, unfolded portion of the mixture (their Fig. 2c) aggregates
and precipitates out of solution (their Fig. 2e), whereas the helically
folded, apparently tetrameric component (their Fig. 2c) stays unaltered
in solution and provides the circular dichroism signal. The latter interpretation
would be consistent with our hypothesis that destabilization
of helical tetramers into unfolded monomers in cells may precede
pathological α-synuclein aggregation. In summary, the difference
between their purified brain α-synuclein and our purified erythrocyte
and neuroblastoma α-synuclein seems to be the relative abundance of
the aggregation-resistant helical material at the time of initial analysis.

Even though the dynamic light scattering data of Burré et al. in
Fig. 2e imply an increasing amount of aggregates (in agreement with
the partial precipitation suggested in their Fig. 2d), no conclusion
about the amount of remaining monomers/tetramers in the sample
can be drawn from this, given the inability of dynamic light scattering
to detect small particles if sufficient amounts of large particles are
present in the mixture.

Collectively, the data of Burré et al. show the existence of some
helically folded, apparently tetrameric (58.5 kDa) protein in purified
α-synuclein isolated from normal mouse brain, although in their
hands, this constitutes only half (by their Fig. 2d) or a minor portion
(by their Fig. 2c) of their total protein immediately after purification
and only becomes the major species upon incubation over time (their
Fig. 2d). Given that the two studies are therefore debating the relative
proportion under native conditions of helically folded tetramers, not
their existence per se, we believe it is reasonable to pursue attempts to
stabilize helically folded native α-synuclein tetramers as an approach
to reducing the pathological aggregation of monomers. In light of our
findings in Bartels et al. and in an extensive α-synuclein crosslinking
analysis in intact neurons and other cells, the combined recent data
support the hypothesis that physiological α-synuclein occurs in cells
in an oligomeric (principally tetrameric) state in the cytosol, which
is in equilibrium with unfolded monomers.

This Reply is written by two of the three authors of the
original paper. J. G. Choi left the laboratory for graduate school in
2011.
Conserved regulatory elements in AMPK


ARISING FROM B. Xiao et al. Nature 472, 230-233 (2011) 


The AMP-activated protein kinase (AMPK), an αβγ heterotrimeric
enzyme, has a central role in regulating cellular metabolism and
energy homeostasis. The α-subunit of AMPK possesses the catalytic
kinase domain, followed by a regulatory region comprising the autoinhibitory
domain (AID) and α-linker. Structural and biochemical
studies suggested that AID is central to mammalian AMPK regulation;
however, this notion has been challenged recently by Xiao et al. on the
basis of their active AMPK structure (Protein Data Bank accession
2Y94). On close inspection, however, we found that the α-subunit
regulatory region was incorrectly built in their model, and our rebuilt
model suggests a universal occurrence of the AID domain in AMPKs;
we have also identified a novel regulatory motif that is essential for
AMPK regulation.

The AID domain from Schizosaccharomyces pombe AMPK-like
kinase folds into a non-canonical UBA conformation comprising
three α-helices. However, Xiao et al. reported that, in their structure of
an active AMPK containing the kinase domain and flanking regulatory
region, the AID region was disordered. In addition,
their bioinformatic analysis suggested that the AID region in vertebrate
AMPKs may adopt a four-helical conformation. Thus, they proposed
that the vertebrate AMPK does not contain an AID domain as in the
yeast AMPK. We determined the crystal and NMR structures of two
AID fragments from rat α1 and human α2 subunits, respectively,
showing that each of the AID domains comprises only three α-helices
stabilized by highly conserved hydrophobic residues (Fig. 1a, b). We
found that the overall structures of the isolated mammalian AIDs share
a markedly similar conformation to that of the yeast orthologue, indicating
that AID is highly conserved throughout evolution from yeast to
mammals.

AMPK is characterized by its ability to be regulated by binding of
adenine nucleotides to the γ-subunit. One important feature of the
structure of Xiao et al. is that a small α-subunit segment (α1 373-382,
termed the α-hook) was modelled to interact with the γ-subunit
(Fig. 1a). However, His 376 of the α-subunit (αHis 376) is unfavourably
nested in a positively charged cavity and the adjacent phosphorylated Thr 377
(pThr 377) is positioned in a negatively charged pocket. Thus, the potentially
misrepresented AID structure together with the possibly incorrect
structure of the α-hook prompted us to re-examine the crystallographic
data underlying the assignment of Xiao et al. The α-hook was previously
back-traced from the α-subunit carboxy-terminal domain; however, there is
a discontinuity of electron density between residues αAsn 382 and
αAla 394. We thus rebuilt the model, in that the strong side-chain
feature is reassigned as αArg 363 (previously assigned to pThr 377)
and αGlu 362 takes the place of αHis 376 (Fig. 1c, d). The remaining
residues of the α-regulatory region were manually built, and the amino
acid sequence of the AID helix α3 and the following α-linker is off-registered
by more than 10 amino acids relative to the previous structure
(Fig. 1a). Notably, the two helical regions following the kinase domain
are respectively reassigned, corresponding to AID helices α1 and α3,
and they adopt similar conformation in isolated AID and trimeric
AMPK structures (Appendix Fig. 1). The orientation of AID in the
rebuilt model is consistent with our conformational switch model for

Figure 1 | Conserved AID and α-RIM for AMPK allosteric activation.
a, Sequence alignment of the α-subunit regulatory region from mammalian
AMPKs. The newly identified α-RIM is boxed in magenta, and previous α-hook
in cyan. Secondary structural elements in the predicted (yellow) and isolated
(marine blue) AID structures and those in the active AMPK determined
previously (green) and here (blue) are shown above the alignment. Key
interacting residues in our α-RIM and previous α-hook are indicated by red and
blue asterisks, respectively. Shown at the bottom is the schematic diagram of the
αβγ heterotrimeric AMPK: kinase domain, light blue; AID, blue; α-linker, pink;
α-RIM, magenta; α-hook, cyan; α-CTD, brown; β-CTD, green; β-loop, dark
green; γ-subunit, light orange. CTD, C-terminal domain. b, Crystal structure of
rat α1-AID (marine blue) and solution structure of human α2-AID (purple).
The two isolated AID structures are superposed to the AID domain (light blue)
in the S. pombe kinase domain-AID structure. The conserved hydrophobic core
residues of rat α1-AID are highlighted as cyan sticks. c, Graphical
representations of the rebuilt active AMPK. The colour scheme follows the
schematic diagram in a, with the two AMP molecules at sites 3 and 4
highlighted as green sticks. The rat α1-AID (marine blue ribbon) is superposed
to the putative AID helix α1 and α3. d, SA-omit map (contoured at 2.0σ) for the
α-RIM (α-hook). Residues of the rebuilt and previous models are shown in
magenta sticks and bluish green lines, respectively. e, The α-RIM has significant
roles in allosteric activation of AMPK. The kinase assay was performed in the
presence of 6.6 nM phosphorylated AMPK holoenzyme (wild type or mutants),
200 μM SAMS, 1 mM ATP and 10 mM MgCl2, with or without addition of
200 μM AMP (mean and s.e.m., n = 3).
AMPK regulation: the effect of AMP binding to the γ-subunit might be
transmitted, via the α-linker, to the AID, releasing AID from the kinase
domain and ultimately activating AMPK. Thus, we consider that the AID domain is
largely folded in the active AMPK, whereas the previously identified
α-hook is disordered in the rebuilt model (Fig. 1c).
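
The register shift described in this paragraph can be read directly from the two reassignments given in the text. The Python sketch below is a bookkeeping illustration only; the dictionary layout and labels are our own, and the residue numbers are those quoted above (pThr 377 reassigned as αArg 363, and αHis 376 replaced by αGlu 362).

```python
# Residue numbers are taken from the text; the data layout is illustrative.
reassignments = {
    "strong side-chain feature": {"previous": ("pThr", 377), "rebuilt": ("Arg", 363)},
    "neighbouring residue": {"previous": ("His", 376), "rebuilt": ("Glu", 362)},
}

# Register offset = residue number in the previous model minus that in the rebuilt model.
offsets = {
    site: models["previous"][1] - models["rebuilt"][1]
    for site, models in reassignments.items()
}

for site, offset in offsets.items():
    print(f"{site}: shifted by {offset} residues")
```

Both positions give the same offset, which agrees with the statement that the sequence is off-registered by more than 10 amino acids relative to the previous structure.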

Instead of the α-hook, we identified a novel α-subunit motif (α1
358-368) termed regulatory-subunit-interacting motif, α-RIM (Fig. 1c).
αArg 363 penetrates into the pocket formed by a newly observed
β-loop (human β2 219-235) that was disordered in all other core
structures. The γ-subunit contains four potential nucleotide-binding
sites, of which sites 3 and 4 are important for AMPK allosteric stimulation
by AMP. The acidic αGlu 362 interacts with the key γ-subunit
residues Arg 69 and Lys 169 that are also in contact with the AMP
molecule bound at site 3. Notably, these two important interacting
residues in the α-RIM are highly conserved in mammals. To assess
their significance, we generated point mutations on rat α1β1γ1 trimer
and examined their effects on AMPK allosteric activation. The wild-type
holoenzyme, as well as the reported α-hook mutant αRTDE4A,
was activated about twofold by AMP (Fig. 1e). Mutating the two charged
α-RIM residues αGlu 362 and αArg 363 individually or simultaneously
to Ala (αE362A, αR363A and αE362A/R363A) largely abolished the
AMP-dependent activation. Similarly, the β-loop deletion mutant
(Δβ-loop(218-228) on rat β1) no longer responded to changes in AMP
concentration. These data clearly demonstrate that the α-RIM in the
proximity of site 3 has essential roles in AMPK allosteric regulation.

In summary, we have provided evidence to reaffirm our earlier
report that AID is universally present in AMPK, and hereby rectify
the misinterpretation of the AID structure and the α-hook feature by
Xiao et al. Moreover, the essential role of the α-RIM (instead of the
previously defined, disordered α-hook) for AMPK allosteric regulation
is strongly supported by its interaction with the crucial γ-subunit site 3
and mutagenesis data.


Methods 


The crystal of rat α1-AID (284-336) was obtained with a reservoir solution
containing 0.1 M HEPES, pH 7.5, 0.3 M magnesium sulphate, 36% isopropanol
(v/v) at 4 °C. The NMR structure of human α2-AID (282-339) was determined
with ¹⁵N and ¹⁵N/¹³C proteins. The active AMPK structure was rebuilt with the
diffraction data (2Y94.cif) downloaded from Protein Data Bank. The structural
statistics are summarized in Appendix Tables 1-3, respectively. The site-specific
mutations were generated by overlap PCR procedure and verified by DNA
sequencing. AMPK was phosphorylated by CaMKKβ and the activity was determined
with SAMS peptide using a coupled assay.
Appendix


Appendix Figure 1 | SA-omit map (contoured at 2.0σ) for the AID region.
The colour scheme of the rebuilt model follows that in Fig. 1c, and the isolated
rat α1-AID is shown in marine blue. Two residues from AID helices α1 and α3,
Asp 290 and Ile 333, are shown as stick representation. Upon superposition,
helix α2 in the isolated AID is located between the γ-subunit CBS1 and β-CTD
from a symmetry molecule. The loop between AID helices α2 and α3 slightly
clashes with a loop from the symmetry β-CTD, but such loop regions can be
easily accommodated with very minor adjustments in their conformations.
Notably, there are discontinuous densities for the AID helix α2 in this omit
map. Therefore, we think that the AID helix α2 in the active AMPK may not be
fully disordered, but rather adopt slightly different conformations, which
results in the poor electron density.


Appendix Table 1 | Data collection and refinement statistics for rat α1-AID
crystal structure

                              Rat α1-AID (residues 284-336)

Data collection*
  Space group                 P222,
  Cell dimensions
    a, b, c (Å)               22.1, 36.3, 98.8
    α, β, γ (°)               90, 90, 90
  Resolution (Å)              50.0-1.5 (1.53-1.50)†
  Rmerge (%)                  4.6 (26.3)
  I/σI                        36.4 (3.6)
  Completeness (%)            99.5 (98.5)
  Redundancy                  3.6 (3.1)
Refinement
  Resolution (Å)              24.4-1.5
  No. reflections             24,447
  Rwork/Rfree                 21.2/18.2
  No. atoms
    Protein                   524
    Water                     79
  B-factors (average)         22.52
    Protein                   20.43
    Water                     36.37
  r.m.s.d.
    Bond lengths (Å)          0.005
    Bond angles (°)           0.902

r.m.s.d., root mean square deviation.
* The data set was collected from a single crystal.
† Highest resolution shell is shown in parentheses.
Appendix Table 2 | Experimental and structural statistics for the ensembles
of 20 NMR structures of human α2-AID

                                        Human α2-AID (residues 282-338)

Distance constraints
  Intra-residue (|i-j| = 0)             587
  Sequential (|i-j| = 1)                314
  Medium (2 ≤ |i-j| ≤ 4)                277
  Long-range (|i-j| ≥ 5)                190
  Ambiguous                             379
  Total                                 1,747
Dihedral angle constraints
  φ                                     42
  ψ                                     42
  Total                                 84
Hydrogen bond constraints               15
Structure statistics (20 structures)
Violation statistics
  NOE violation (>0.3 Å)                0
  Maximum NOE violation (Å)             0.16
  Torsion angle violation (>5°)         0
Energy
  Mean AMBER energy (kcal mol⁻¹)        −2,230.8
  Mean bond energy                      33.7
  Mean angle                            122.5
  Mean dihedral                         550.7
  Mean VDW                              −423.3
Ramachandran plot analysis
  Most favoured regions (%)             90.1
  Additional allowed regions (%)        8.7
  Generously allowed regions (%)        0.8
  Disallowed regions (%)                0.4
r.m.s.d. from mean structure*†
  Backbone atoms (Å)                    0.52 ± 0.11‡
  All heavy atoms (Å)                   1.06 ± 0.10‡
Regular secondary structures (Å)*†
  Backbone atoms (Å)                    0.31 ± 0.07‡
  All heavy atoms (Å)                   1.05 ± 0.12‡

NOE, nuclear Overhauser effect; VDW, van der Waals.
* The average r.m.s.d. between the 20 structures of the lowest AMBER energies and the mean
coordinates (± standard deviation).
† Calculated with PROCHECK_NMR.
‡ Residues 12-56 in AID were used in the calculation.


Appendix Table 3 | Refinement statistics for the active AMPK structure
using diffraction data (2Y94.cif) downloaded from PDB

                              Heterotrimeric AMPK
                              (rat α1, human β2, rat γ1)

Refinement
  Resolution (Å)              29.53-3.24
  No. reflections             19,619
  Rwork/Rfree                 24.70/28.40
  No. atoms                   6,605
    Protein                   6,524
    Ligand/ion                81
  B-factors (average)         133.29
  r.m.s.d.
    Bond lengths (Å)          0.014
    Bond angles (°)           0.652